I think this project offers an efficient way for researchers to compare alignment plans, and for observers to see the progress towards solving the Alignment Problem.
AI-Plans.com is an open platform that we aim to develop into a living peer review of AI Alignment plans. The site currently hosts over 180 alignment plans, and users regularly post more.
Several AI alignment researchers at institutions and labs (including DeepMind, xAI, Berkeley, MIT, and Cambridge) have expressed interest in the site, and many independent researchers have joined. They've used it to find valuable papers, submit plans for feedback, and suggest improvements to the site.
We’ve also held Critique-a-Thons judged by experts and have received 110+ submissions so far. Submissions are anonymized so judges evaluate critiques on merit alone. Student societies from the University of Edinburgh, University of Warwick, and University of Victoria will be joining the next Critique-a-Thon.
We’re seeking $10,000 to pay developers for improvements/maintenance and fund future Critique-a-Thons.
We aim to drastically improve the rate of Alignment research.
We will do this by:
Providing a platform where AI Alignment plans are ranked from top to bottom, from those with the best-supported strengths and fewest identified vulnerabilities to those with the most vulnerabilities (a rough sketch of such a ranking follows this list).
Making it easier to give high-quality feedback on AI Alignment plans.
Making it easier to judge the quality of an AI Alignment plan.
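As a rough illustration of the ranking idea above, here is a minimal sketch in Python. It assumes a hypothetical model in which each plan accumulates reviewer-assigned strength and vulnerability scores and is ordered by their difference; the `Plan` class, its fields, and the additive scoring rule are illustrative assumptions, not the site's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Plan:
    """A hypothetical alignment plan with reviewer-assigned scores (illustrative only)."""
    title: str
    strength_scores: List[float] = field(default_factory=list)       # e.g., weights on posted strengths
    vulnerability_scores: List[float] = field(default_factory=list)  # e.g., weights on posted critiques

    def score(self) -> float:
        # Illustrative aggregate: total strength weight minus total vulnerability weight.
        return sum(self.strength_scores) - sum(self.vulnerability_scores)

def rank_plans(plans: List[Plan]) -> List[Plan]:
    """Order plans from strongest (fewest weighted vulnerabilities) to weakest."""
    return sorted(plans, key=lambda p: p.score(), reverse=True)

if __name__ == "__main__":
    plans = [
        Plan("Plan A", strength_scores=[3, 2], vulnerability_scores=[1]),
        Plan("Plan B", strength_scores=[1], vulnerability_scores=[4, 2]),
    ]
    for p in rank_plans(plans):
        print(f"{p.title}: {p.score():+.1f}")
```

A real ranking would presumably weight critiques by reviewer agreement or expertise rather than summing raw scores, but the sorting step itself stays the same.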
Most of the $10,000 fund will go to paying developers and our team to maintain and update the site and conduct outreach.
The rest of the fund will go toward cash prizes for Critique-a-Thons, which incentivize more participants to join and submit high-quality critiques.
Our team includes a developer and a QA engineer, both with several years of experience, along with several other contributors.
We are advised by the following AI experts:
Dr. Peter S. Park: Harvard Mathematics Ph.D. and MIT postdoc at the Tegmark lab
Charbel-Raphaël Segerie: Head of the AI Unit at EffiSciences, leader of the Turing Seminar, and leader of the ML4Good bootcamp
Dr. Linda Linsefors: Co-founder of AI Safety Camp and AI Safety Support
Dr. Seth Herd: Research Associate in Cognitive Psychology and Cognitive Neuroscience at CU Boulder. Dr. Herd was also a participant in our first Critique-a-Thon.
To date, we’ve held two Critique-a-Thons. The first had over 40 submissions, and the second had over 70. Both produced several highly refined critiques of AI Alignment plans.
I am personally in touch with researchers from many labs and institutions, including Anthropic, DeepMind, Berkeley, MIT, Cambridge, and Oxford. So far, they've all been enthusiastic about the project.
The project is unlikely to fail. We’ve gained growing visibility among AI researchers and educational institutions, and our Critique-a-Thon submissions have grown by 75% with each round.
The most likely cause of failure would be a key team member experiencing a sudden personal health crisis.
We’ve received $500 from AI Safety Strategy for the first Critique-a-Thon.