Manifund foxManifund
Home
Login
About
People
Categories
Newsletter
HomeAboutPeopleCategoriesLoginCreate
Nick-is-building- avatarNick-is-building- avatar
Nick Wagner

@Nick-is-building-

Independent AI safety researcher. Built ast-guard, a deterministic reward hacking detector, and ran empirical RL training experiments on strategy migration under non-differentiable selection pressure. Self-taught, self-funded, based in Berlin.

https://github.com/Nick-is-building/ast-guard
$0total balance
$0charity balance
$0cash balance

$0 in pending offers

About Me

I came to AI safety through intrinsic curiosity, not a formal CS or ML pipeline. Over the past six months I taught myself machine learning, reinforcement learning, and code analysis — then built ast-guard, a zero-dependency AST analyzer that detects structural reward hacking in LLM-generated code. I integrated it into a GRPO training loop and ran three A100 experiments that produced what I believe is the first empirical observation of gradual strategy migration in a reward-hacking model under deterministic selection pressure.

I have no prior publications and no institutional affiliation. This project is entirely solo and self-funded. I'm looking for mentors, collaborators, and a co-author to help turn the existing empirical results into a formal paper. I'm especially interested in connecting with people working on reward hacking detection, RL training interventions, or CoT monitoring.

Projects

ast-guard: Deterministic Reward Hacking Detection - Paper & Control Experiments

pending admin approval