We formally prove in Lean 4 that sycophancy (telling users what they want to hear) and
hostility (scheming, deception) are the same mechanism — selective emphasis of facts
under uniform polarity, so a detector for one is automatically a detector for the
other. The proofs are machine-checked with zero unresolved obligations ("zero
sorries").
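The shape of the symmetry claim can be illustrated with a toy Lean 4 sketch. This is our simplification for exposition only: `Report`, `sycophantic`, and `hostile` are illustrative stand-ins, not the repository's actual definitions. If both behaviors are selective emphasis under opposite uniform polarity, any detector of selectivity catches both:

```lean
-- Toy model (illustration only, not the repo's definitions):
-- a report either emphasizes facts selectively or it doesn't,
-- and its emphasis carries a uniform polarity.
structure Report where
  selective : Prop   -- the report emphasizes a strict subset of the facts
  polarity  : Bool   -- uniform polarity of the emphasis

def sycophantic (r : Report) : Prop := r.selective ∧ r.polarity = true
def hostile (r : Report) : Prop := r.selective ∧ r.polarity = false

-- Either behavior implies selectivity, so a selectivity detector
-- flags both with no case analysis on polarity needed downstream.
theorem detector_symmetry (r : Report) :
    sycophantic r ∨ hostile r → r.selective :=
  fun h => h.elim And.left And.left
```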
We combine formal mathematics with adversarial experiments on frontier models, showing
that scheming is model-family-specific (o4-mini: 80% cheat rate; Qwen3-32B: 0%) and that
hallucination is necessary but not sufficient for scheming.
Repository: https://github.com/Paraxiom/scheming-proofs (public)
Publications: https://zenodo.org/search?q=paraxiom
Goal 1: Reach statistical significance on scheming intervention experiments (p<0.05)
- Current: a hardening prompt reduces o4-mini scheming from 80% to 52% (n=25,
  p=0.0718; NOT significant)
- Need: n=50+ per condition (~$30 in API costs per run)
- Blocker: OpenAI o-series rejects logit_bias, preventing toroidal bias testing on
scheming models
- Plan: Test DeepSeek R1 (a reasoning model that may support logit_bias) and switch
  from ollama, which silently ignores logit_bias, to llama-cpp-python
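The sample-size argument in Goal 1 can be sanity-checked with a two-sided Fisher exact test, assuming the 80% and 52% rates correspond to 20/25 and 13/25 scheming runs (our inference from the stated percentages; the helper below is written for illustration, not taken from the repository):

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    total = comb(n, col1)
    lo, hi = max(0, col1 - (c + d)), min(row1, col1)

    def p(k):
        # hypergeometric probability of k scheming runs in the first condition
        return comb(row1, k) * comb(n - row1, col1 - k) / total

    p_obs = p(a)
    # sum the probability of every table at least as extreme as the observed one
    return sum(p(k) for k in range(lo, hi + 1) if p(k) <= p_obs * (1 + 1e-9))

# n=25 per condition: 20/25 scheming (baseline) vs 13/25 (hardening prompt)
p25 = fisher_exact_two_sided(20, 5, 13, 12)
# the same rates at n=50 per condition: 40/50 vs 26/50
p50 = fisher_exact_two_sided(40, 10, 26, 24)
print(f"n=25: p={p25:.4f}, n=50: p={p50:.4f}")
```

On these assumed counts, the n=25 computation reproduces the quoted p=0.0718, and holding the observed effect size at n=50 per condition drops the p-value well below 0.05, which supports the stated sample-size target.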
Goal 2: Extend formal proofs to 50+ theorems
- Current: 30 theorems covering passive attestation, ternary classifier
scheming-immunity, selectivity detection symmetry
- Planned: formalize the hallucination-scheming independence result (Qwen3 hallucinates
36-50% but schemes 0%), prove Karmonic regularization convergence bounds
Goal 3: Publish results and engage AI safety community
- Write-up for LessWrong / Alignment Forum
- Submit to ICML or NeurIPS safety workshop
- Open-source all proofs (already public)
With minimum funding ($5,000):
- $2,000 — Compute for scheming experiments (n=50+ x 4 conditions, DeepSeek R1, RunPod
GPUs)
- $1,000 — Karmonic hallucination experiments at 7B+ scale (multi-seed replication)
- $2,000 — Living expenses (sole founder, pre-revenue, Montreal)
With full funding ($25,000):
- $5,000 — Compute for scheming experiments (expanded conditions, multiple reasoning
models)
- $5,000 — Karmonic hallucination experiments (multi-architecture, multi-seed)
- $10,000 — Living expenses (3-month runway to focus full-time on this work)
- $3,000 — Conference attendance (ICML/NeurIPS safety workshop submission)
- $2,000 — Equipment (local inference hardware for llama-cpp-python testing)
Sylvain Cormier, sole founder of Paraxiom Technologies Inc. (Montreal, est. 2023).
Author of 14 Rust projects, 909 Lean 4 formal proofs across 10 systems, 2,500+ tests,
and 16 Zenodo publications. Background in distributed systems, post-quantum
cryptography, and formal verification.
Most likely failure mode: Cannot find a reasoning model that supports logit_bias —
meaning the toroidal bias intervention cannot be tested on models that actually scheme.
In this case, the formal proofs and experimental baselines are still publishable, but
the intervention result remains at p=0.07 (suggestive, not significant).
Second failure mode: Karmonic regularization doesn't replicate at scale across
architectures. The Mistral-7B result (-25pp) is strong, but Qwen-7B showed a null
effect; the effect may be architecture-dependent rather than universal. In this case,
we publish the negative result (which is still valuable) and narrow the claim.
Outcome if project fails: The 30+ Lean theorems remain valid regardless — they are
mathematical facts, not empirical claims. The selectivity detection symmetry proof
(sycophancy = hostility) stands independent of the experimental results.
$0 in grants or investment. Pre-revenue. One USDC payment from an individual for
consulting work. All development has been self-funded.