A donor has sent another $10k, which will partly fund the 2025 edition.
Shallow review of AI safety 2024
Project summary
Last year, a collaborator and I summarised every live project in AI safety, tried to understand their theories of change, listed outputs, personnel, and funding amounts, and wrote an editorial.
We talked to a couple dozen researchers to check our glosses and get their views. The post was well-received (100 karma on the Alignment Forum, which is very rare) and is now, for example, a standard intro resource at 80,000 Hours. We did it pro bono (or rather, failed to obtain retroactive funding).
We want to update the review for 2024: progress, shutdowns, trends, and our takes.
What are this project's goals? How will you achieve them?
The original goal was to help new researchers orient and know their options, to help everyone understand where things stand, and to help funders see quickly what has already been funded. Simply putting all links in one place was perhaps half of the value.
This iteration: the same as above, but incorporating last year's feedback and seeking sign-off from more than 50% of those covered. Also a professionalised version suitable for policy audiences.
$8K: bare-bones update (80 hours). Skim everything, reuse the taxonomy, and seek corrections in the comments.
$13K: much more effort on verifying details and seeking out consensus; more editorial and synthesis.
$17K: add a section on academic and outgroup efforts, plus a glossy formal report optimised for policy people.
How will this funding be used?
Wages.
Who is on your team? What's your track record on similar projects?
Gavin and Stag did last year's version. Stephen is the source of many of the (limited) descriptive statistics about the field.
We ran this project last year, and it was well-received. Habryka: "I think overall this post did a pretty good job of [covering] a lot of different work happening in the field. I don't have a ton more to say, I just think posts like this should come out every few months, and the takes in this one overall seemed pretty good to me."
What are the most likely causes and outcomes if this project fails?
N/A
How much money have you raised in the last 12 months, and from where?
$0 so far.
Austin Chen
4 days ago
Manifund has now received @cfalls's $10k donation for your project and added it to this page!
Austin Chen
14 days ago
Approving this project! As I wrote for the Manifund blog:
Gavin Leech is a forecaster, researcher, and founder of Arb; he’s proposing to re-run a 2023 survey of AI safety. The landscape shifts pretty quickly, so I’d love to see what’s changed since last year.
I'm especially glad to see that others, including Ryan, Anton, and Matt of Open Phil, are also excited to fund this.
(I've also updated the funding limit to indicate that Gavin's funding needs have been met.)
Matt Putz
15 days ago
I work at Open Philanthropy, and I recently let Gavin know that Open Phil is planning to recommend a grant of $5k to Arb for this project (they had already raised ~$10k by the time we came across it).
Like others here, I believe this overview is a valuable reference for the field, especially for newcomers.
I wanted to flag that this project would have been eligible for our RFP for work that builds capacity to address risks from transformative AI. I worry that not all potential applicants are aware of the RFP or its scope, so I’ll take this opportunity to mention that this RFP’s scope is quite broad, including funding for:
Training and mentorship programs
Events
Groups
Resources, media, and communications
Almost any other type of project that builds capacity to address risks from advanced AI (in the sense of increasing the number of careers devoted to these problems, supporting people doing this work, and sharing knowledge related to this work).
More details at the link above. People might also find this page helpful; it lists all currently open application programs at Open Phil.
Thanks to Austin, whose EA Forum post brought this funding opportunity to our attention.
Nickolai Leschov
17 days ago
I appreciate your work summarizing every live project in AI safety and would love to see it thoroughly updated for 2024.
Gavin Leech
17 days ago
Thanks very much to all donors! A private donor has offered to fill the difference, so please stop sending me money (mods, if there's a way to close projects, I can't see it). We've started work.
Anton Makiievskyi
22 days ago
I appreciated the previous iterations of this review. Trying to encourage more of the same.
Neel Nanda
23 days ago
I think collections like this add significant value to newcomers to the field, mostly by being a list of all the areas maybe worth thinking about, plus key links (rather than, e.g., by providing a lot of takes on which areas are more or less important, unless the author has excellent taste). Gavin has convinced me that the previous post gets enough traffic for it to be valuable to keep up to date.
I'm not super convinced that a ton has changed since 2023, but enough has to be worth at least some updating, so I'm funding the MVP version (I expect this to have more errors than the higher-funded versions would, but for these largely to be caught in the comments; even higher funding would still leave some errors). I'd be fine with others funding it higher, though.