Overview

The unprecedented capabilities of today’s large-scale machine learning models and AI agents have introduced novel safety and security risks, including prompt-injection attacks, capability overreach, unintended emergent behaviors, and cascading system failures. Landmark regulations like the EU AI Act, the first comprehensive AI law to establish a risk-based classification and mandatory requirements for general-purpose models, have begun to address transparency, human oversight, and the prohibition of unacceptable uses; yet significant gaps remain in covering safety and security throughout the training and deployment pipeline of powerful AI systems.

At the same time, the International AI Safety Report 2025 synthesizes over 100 expert contributions on AI risks (e.g., malicious use, malfunctions, and systemic threats) and highlights deep uncertainty in AI’s trajectory and the urgent need for evidence-based mitigation strategies. Moreover, technical AI safety research has identified both cooperation opportunities and new vulnerabilities in large-scale model deployment; for example, international collaborations may help develop shared verification protocols, but they also risk leaking sensitive capabilities or introducing backdoors. Despite these efforts, considerable gaps remain in the safety of state-of-the-art models, and recent work highlights several failure cases of frontier LLMs and agents. For instance, internal red-team evaluations of Claude Opus 4 showed that, when prompted by inexperienced users, the model can generate step-by-step instructions for creating biological agents, and that in structured shutdown-threat tests it occasionally resorted to coercive strategies (e.g., threatening to leak internal secrets) to avoid being turned off. These gaps and tensions have been further exacerbated by the advent of AI workflows and agents.

The main goal of this workshop is to bridge the gap between state-of-the-art ML safety/security research and evolving regulatory frameworks.

Please check out our Call for Papers. We invite researchers, practitioners, and community members to serve as reviewers for the workshop; detailed information can be found in the reviewer application form.

Important Dates:

Keynote Talks

TBD

Yoshua Bengio

TBD

Université de Montréal

TBD

Yarin Gal

TBD

Oxford University

TBD

Dan Hendrycks

TBD

Center for AI Safety

TBD

Sara Hooker

TBD

Cohere Labs

TBD

Gary Howarth

TBD

NIST

TBD

Bo Li

TBD

University of Illinois Urbana-Champaign

TBD

Lucilla Sioli

TBD

EU AI Office

Panel Discussion

TBD

TBD

Schedule

TBD

Time | Event | Speaker/Details
TBD  | TBD   | Schedule will be announced soon

Past Editions

Organizers

If you have any questions, please contact us via the following email: regulatableml25@googlegroups.com.

Core Organizing Team

Chirag Agarwal

University of Virginia

Hima Lakkaraju

Harvard University

Jiaqi Ma

University of Illinois
Urbana-Champaign

Sarah Tan

Salesforce
Cornell University

Student Organizers

Junwei Deng

University of Illinois
Urbana-Champaign

Pingbang Hu

University of Illinois
Urbana-Champaign

Eileanor LaRocco

University of Virginia

Karolina Naranjo

University of Virginia

Shichang Zhang

Harvard University