Manifund

I understand where you're coming from, but even alignment papers have to start from existing knowledge. Writing a mechanistic interpretability paper requires mastery of certain aspects of linear algebra and Python, plus some proficiency in avoiding anthropocentric explanations for observed phenomena. Singular learning theory needs algebraic geometry and statistical learning theory. Agent foundations requires non-standard decision theory and applications of logic and probability theory.

I expect these prerequisites not only to deepen over time as alignment becomes more and more paradigmatic, but also to lead to new, distinct techniques that may be hard to learn without being personally mentored by a researcher in Berkeley. And researcher time is valuable, so the sooner we can get the knowledge transfer problem off their hands, the better.
