Garrett Baker

@GarretteBaker

I'm an independent alignment researcher, self-taught in machine learning, convex optimization, and probability theory.

https://github.com/GarretteBaker/
$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

For approximately the past year, I’ve been doing alignment research full-time, working on a variety of approaches and trying to understand the problem deeply enough to invent new ones. If funded, I plan to continue roughly the same work as before, which has historically been scalable mechanistic interpretability, formal and prosaic corrigibility, reflective stability, and assorted work in value theory, along with a lot of upskilling in convex optimization, machine learning, neuroscience, and economics.

My current project is an attempt to connect the tools & theory of singular learning theory with our knowledge of the inductive biases and loss landscapes of large language models.

Transactions

For | Date | Type | Amount
AI Safety Reading Group at metauni [Retrospective] | 2 months ago | project donation | 10
Act I: Exploring emergent behavior from multi-AI, multi-human interaction | 2 months ago | project donation | 96
Act I: Exploring emergent behavior from multi-AI, multi-human interaction | 2 months ago | project donation | 50
Lightcone Infrastructure | 3 months ago | project donation | 95
<176bd26d-9db4-4c7a-98c0-ba65570fb44c> | 3 months ago | tip | +1
Next Steps in Developmental Interpretability | 3 months ago | project donation | 200
Lightcone Infrastructure | 3 months ago | project donation | 50
Manifund Bank | 3 months ago | deposit | +500