Charbel-Raphael Segerie
Glen M. Taggart
And other projects aimed at producing Pareto improvements in SAEs (at higher funding levels)
Louis S. Berman
AI-Risk Education for Politicians
Karpov
I plan to investigate what realistic RL training conditions might lead to LLMs developing steganographic capabilities.
Remmelt Ellen
Kabir Kumar
Apart Research
Incubate AI safety research and develop the next generation of global AI safety talent via research sprints and research fellowships
Alexander Pan
Kunvar Thaman
Zhonghao He
Surveying neuroscience for tools to analyze and understand neural networks and building a natural science of deep learning
Ryan Kidd
Help us support more research scholars!
Dusan D Nesic
Free/subsidized/cheap office space outside the EU, in good timezones, with favorable visa policies (especially for Chinese and Russian citizens, but also others).
Lawrence Chan
3 months
Luan Rafael Marques de Oliveira
Support to translate BlueDot Impact’s AI alignment curriculum into Brazilian Portuguese, to be used in university study groups and an online course
Cadenza Labs
We're a team of SERI-MATS alumni working on interpretability, seeking funding to continue our research after our LTFF grant ended.
Robert Krzyzanowski
Compute and infrastructure costs
Apollo Research
Hire 3 additional AI safety research engineers/scientists
Jesse Hoogland
6-month funding for a team of researchers to assess a novel AI alignment research agenda that studies how structure forms in neural networks
Jaeson Booker
Skilling up on interpretability and multi-agent alignment
Allison Duettmann
Focused on 1. BCI and WBE for safe AI, 2. cryptography and security for safe AI, and 3. safe multipolar AI