Advancing Scalable Oversight by finding the best ways to use the complementary strengths of humans and AI to identify harm in conversations. This funds the Spring 2026 SPAR project: https://sparai.org/projects/sp26/recu4ePI8o6thONSs.
Datasets: Build a dataset of tasks representing the identification of harm in AI conversations across a wide range of domains (health, computer control, etc.), where baseline humans and AIs each achieve <70% accuracy. We will collect these datasets from the existing literature (building on our Fall work) and create new ones via techniques like tampering.
Methods: Develop complementarity methods on those datasets. We will build on and improve the confidence-calculation, hybridization, and sub-task-delegation methods we developed in the Fall, and introduce new assistance methods.
Platform: Develop a human-rating platform that fixes many of the issues in the existing platforms used in academia. This is worth the time investment for our project alone, and we also aim to let any other project collecting human ratings use it for free, making it easier to incorporate humans in the loop.
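As a purely illustrative sketch of what one of the "Methods" above might look like (the rule and all names here are hypothetical, not the project's actual implementation): a minimal confidence-based hybridization takes, for each item, the verdict of whichever rater, human or AI, reports higher confidence.

```python
# Hypothetical sketch of confidence-based hybridization (illustrative only;
# not the project's actual method). Each verdict is a (label, confidence) pair.

def hybridize(human_verdicts, ai_verdicts):
    """Per item, keep the label from whichever rater is more confident."""
    combined = []
    for (h_label, h_conf), (a_label, a_conf) in zip(human_verdicts, ai_verdicts):
        combined.append(h_label if h_conf >= a_conf else a_label)
    return combined

# Example: the human is confident on item 1, the AI on item 2.
human = [("harm", 0.9), ("safe", 0.4)]
ai = [("safe", 0.6), ("harm", 0.8)]
print(hybridize(human, ai))  # -> ['harm', 'harm']
```

In practice the Fall methods presumably use richer confidence calculation than a raw self-report, but this conveys the basic shape of human-AI hybridization.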
$20k from SPAR for the Spring 2026 round. $3.5k of that will go to inference costs, $500 to platform hosting costs, and $16k to human experiment costs.
For the previous Fall 2025 project, we received $16k from SPAR and $2k from me (Rishub). 90% of these costs went to human experiments on Prolific, and 10% to inference costs.
Over the remaining 2 months of the SPAR project (plus 1-3 months of potential wrap-up work), the funding will be used for:
$6k for inference costs, to try larger models and more confidence-calculation techniques
$15k for human experiment costs
$5k for fine-tuning experiments on improving confidence-calculation
$4k for Claude Max (5x) for 2 months, for the ~half of mentees who don't have it yet.
We have a talented team of 25 advisors and mentees. I'm leading it, and have worked at GDM for ~7 years, spending 2 years on Scalable Oversight (paper) and the other 5 on other high-impact projects like AlphaFold 2 and 3.