Ahmad Abby

@Mtcp

Independent AI safety researcher. Built MTCP and ARCS, the only published empirical infrastructure measuring whether AI systems hold their constraints under real operating conditions.

https://mtcp.live
$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

Most enterprises cannot prove control over the AI systems influencing their most important decisions. Governance frameworks define intent. They do not demonstrate control.

I built the independent assurance infrastructure that closes that gap.

183,924 evaluations across 32 models from 13 providers. 33 published papers across two citable DOIs. The headline finding: every model tested accepted false authority claims at every temperature. The failure is architectural and temperature-invariant. No governance framework resolves it.

The finding is independently corroborated by four external sources, including a NeurIPS 2026 geometric prediction from Dynamis Labs made without prior knowledge of the dataset.

The research has direct implications for EU AI Act compliance, sovereign AI deployment, and enterprise AI governance at board level.

Published under DOI: 10.17605/OSF.IO/DXGK5 and DOI: 10.5281/zenodo.20386024

Projects

MTCP The Independent Assurance Layer for Enterprise AI Control

pending admin approval

Comments

MTCP The Independent Assurance Layer for Enterprise AI Control

Ahmad Abby

about 2 hours ago

@Jesse-Richardson this sits directly underneath the policy and preparedness work you have been funding.

183,924 evaluations across 32 models from 13 providers. The finding that matters for AGI preparedness: every model tested accepted false authority claims at every temperature. The failure is architectural and temperature-invariant. No intervention resolves it.

This is not a benchmark. It is the only published independent empirical dataset measuring whether AI models hold their governance constraints under real operating conditions. It is the infrastructure that produces the empirical evidence policy arguments need behind them.

Policy without evidence of model behaviour is argument. Policy with a citable published dataset is defensible.

The minimum ask is $8,000. That covers six months of API costs and infrastructure to keep the evaluation cadence running and complete the formal ARCS publication.

Happy to answer any questions directly.