Jacob Steinhardt
Alex Leader
Measuring whether AI can autonomously execute multi-stage cyberattacks to inform deployment decisions at frontier labs
David Krueger
Our mission is to inform and organize the public to confront societal-scale risks of AI and to put an end to the reckless race to develop superintelligent AI.
Joseph E Brown
A constraint-first approach to ensuring non-authoritative, fail-closed behavior in large language models under ambiguity and real-world pressure
Amrit Sidhu-Brar
Research on how to navigate the transition to a world with superintelligent AI systems
Lawrence Wagner
Mackenzie Conor James Clark
An open-source framework for detecting and correcting agentic drift using formal metrics and internal control kernels
Xyra Sinclair
Unlocking the paradigm of agents + SQL + compositional vector search
Anthony Ware
Identifying operational bottlenecks and cruxes between alignment proposals and executable governance.
Mirco Giacobbe
Developing the software infrastructure to make AI systems safe, with formal guarantees
Gergő Gáspár
Help us solve the talent and funding bottleneck for EA and AIS.
Will Shin
A global IP project reimagining ecology, future technology, and institutions through character-driven narratives.
Evžen Wybitul
Centre pour la Sécurité de l'IA
Leveraging 12 Nobel signatories to harmonize lab safety thresholds and secure an international agreement during the 2026 diplomatic window.
Chris Canal
Enabling rapid deployment of specialized engineering teams for critical AI safety evaluation projects worldwide
Muhammad Ahmad
A pilot to build policy and technical capacity for governing high-risk AI systems in Africa
Sandy Tanwisuth
We reframe the alignment problem as the problem of governing meaning and intent when they cannot be fully expressed.
David Rozado
An Integrative Framework for Auditing Political Preferences and Truth-Seeking in AI Systems
Brian McCallion
A mechanistic, testable framework explaining LLM failure modes via boundary writes and attractor dynamics
Christopher Kuntz
A bounded protocol audit and implementation-ready mitigation for intent ambiguity and escalation in deployed LLM systems.