Research regarding technical/mathematical aspects of alignment, primarily using the infrabayesianism framework.
Provide a readable redistillation (or two) of the infrabayesian framework. A first draft of this already exists, but more work is needed to turn this from a minimum viable proof of concept (which a few people have already used to great effect) into something clearer and more approachable. Funding would allow me time to write and edit and would buy you a readable writeup of a notoriously tangled (but useful!) framework.
Establish a major result/definition for the ALTER Prize, probably within IB physicalism or logic. Funding would give me time to fully engage with underexplored directions in IB and would buy you a math PhD's expertise and focus.
(Stretch goal) If you grant me enough funding, a research collaborator would be able to produce a companion writeup to mine, aimed at a less mathematically technical audience, whereas mine will not shy away from the necessary mathematical machinery. Sufficient funding would let me bring that collaborator in fully and pay for their time as well.
(Stretch goal) If you grant me enough funding, that same collaborator and I could work together for long enough to create and train a proof-of-concept RL agent that wins UDT puzzles like Troll Bridge or Perfect Transparent Newcomb.
Primarily as a research stipend/living expenses for me (and possibly also a research colleague I'm already working well with), but also partially as living expenses/needs for my grandmother, who lives with me.
If funding permits, I'd also bring in Charles Wittel, who may also be applying for funding here.
I have earned a PhD in pure math (thesis preprint here). That research went decently well, especially given that it was conducted in the middle of a pandemic; apart from that, I have no particular track record yet.
Likely causes: burnout, personal/familial injury or illness, or my turning out to be much worse at math-adjacent research than at pure math research
Likely outcomes: not getting very much done on the writing or research
Likely causes: IB turns out to be the wrong framework
Likely outcomes: tossing out the entire plan and going with a different framework (like finite-factored sets, maybe?)
None so far.