I think Epoch has done truly outstanding work on core trends in AI progress in the past few years. I'm also excited by their recent foray into benchmarking in the form of FrontierMath. I think highly of core team members involved in the project. I found our initial discussions about this project very promising.

Better benchmarks that help us forecast time to AGI (and especially time to relevant capabilities, such as automated AI research) and do so in a highly credible and scientific way are very valuable for informing policymakers and catalyzing important policy efforts.

Donor's main reservations

It's a pilot; it might not work.

Epoch has other funding, but not for this effort, and benchmarking is especially expensive (API calls, labelers).

Process for deciding amount

I reviewed a proposed budget. (Confidential, more on request from Manifund.)

Conflicts of interest

Please disclose e.g. any romantic, professional, financial, housemate, or familial relationships you have with the grant recipient(s).

No COIs.

Compute and other expenses for LLM alignment research

Leopold Aschenbrenner

about 1 year ago

Update Oct 24: the projects have been making exciting progress! There's more work to do, so I'm granting my remaining $200k to support it. Really excited about this!

Update from Ethan:
Some updates on how the last funding was used, and what future funding would be used for:

With the help of your last grant, two of our projects finished up and turned into (IMO) pretty exciting ICLR submissions -- one on investigating whether human feedback is at fault for sycophancy in language models, and another on extending RLHF to vision-and-language models (in the hopes of facilitating process-based training of vision-and-language models like GPT4+)
I've still got 3 projects in flight:

For one of these, we've gotten to the point where we've found a way to improve how representative chain of thought reasoning is of the model's actual process for solving tasks, which I think will be pretty helpful for both improving model transparency and also process-based training schemes; we'll probably have a paper on this in ~2 months
The other 2 projects (debate + model organisms of reward hacking) are in-flight and making good progress, and I'm optimistic that we'll have some interesting public results out in the 4 month timeframe (we already have some results that are interesting to discuss publicly, but probably want to do more work before starting to put together a paper)

I might start up new projects with winter MATS or other external-to-Anthropic collaborators; all of these could benefit from funding for OAI API credits
Our current runway is ~6 weeks, and we expect our compute expenses to go up a bit since we're slated to run compute-intensive experiments for the debate project

Compute and other expenses for LLM alignment research

Leopold Aschenbrenner

over 1 year ago

Ethan Perez is a kickass researcher whom I really respect, and he also just seems very competent at getting things done. He is mentoring these projects, and these are worthwhile empirical research directions in my opinion. The MATS scholars are probably pretty junior, so a lot of the impact might be upskilling, but Ethan also seems really bullish on the projects, which I put a lot of weight on. I'm excited to see more external empirical alignment research like this!

Ethan reached out to me a couple days ago saying they were majorly bottlenecked on compute/API credits; it seemed really high-value to unblock them, and high-value to unblock them quickly. I'm really excited that Manifund regranting exists for this purpose!

Note: I may want to give further funds to support these projects in the future; this should cover them for ~a couple months. I'm trying to see if we can get more API credits via OpenAI to cover some of this first before committing additional funding.

Transactions

For	Date	Type	Amount
Pilot for new benchmark by Epoch AI	2 days ago	project donation	200000
Manifund Bank	9 months ago	deposit	+250000
Compute and other expenses for LLM alignment research	about 1 year ago	project donation	200000
Compute and other expenses for LLM alignment research	over 1 year ago	project donation	200000
<8c5d3152-ffd8-4d0e-b447-95a31f51f9d3>	over 1 year ago	profile donation	+10
Manifund Bank	over 1 year ago	deposit	+400000