Tristan Hume

@trishume

regrantor

Interpretability at Anthropic

https://thume.ca/

$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

Developer and interpretability researcher at Anthropic

Outgoing donations

Comments

Tristan Hume

about 1 year ago

I've been very impressed with the MATS program. Many strong people have gotten into and connected through the program, and when I've visited I've been struck by the caliber of the people I met.

An example is Marius Hobbhahn doing interpretability research during MATS that helped inform the Anthropic interpretability team's strategy, and then Marius going on to co-found Apollo.

Tristan Hume

over 1 year ago

I'm very excited about Apollo based on a combination of the track record of its founding employees and the research agenda they've articulated.

[Marius](https://www.alignmentforum.org/posts/KzwB4ovzrZ8DYWgpw/more-findings-on-memorization-and-double-descent) and [Lee](https://www.alignmentforum.org/posts/z6QQJbtpkEAX3Aojj/interim-research-report-taking-features-out-of-superposition) have published work that's [significantly contributed to Anthropic's work on dictionary learning](https://transformer-circuits.pub/2023/may-update/index.html). I've also met both Marius and Lee and have confidence in them to do a good job with Apollo.

Additionally, I'm very much a fan of alignment and dangerous capability evals as an area of research and think there's lots of room for more people to work on them.

In terms of cost-effectiveness I like these research areas because they're ones I think are very tractable to approach from outside a major lab in a helpful way, while not taking large amounts of compute. I also think Apollo existing in London will allow them to hire underutilized talent that would have trouble getting a U.S. visa.

Transactions

| For | Date | Type | Amount |
| --- | --- | --- | --- |
| Manifund Bank | 8 months ago | return bank funds | 50000 |
| MATS Program | 12 months ago | project donation | 150000 |
| Apollo Research: Scale up interpretability & behavioral model evals research | over 1 year ago | project donation | 200000 |
| Manifund Bank | over 1 year ago | deposit | +400000 |