About Me

I’m an independent researcher and senior software engineer. By day I build production .NET systems and teach computer science as an adjunct instructor; on my own time I design and run ANI — a memory-augmented AI system I’ve operated continuously for about 10 months as a single-subject research probe.

What makes the work unusual is the instrument itself. Most evaluation of AI truthfulness and emotional behavior happens on static prompts or short sessions. ANI runs continuously, holds persistent emotional state and long-term memory, and logs everything — which lets me observe failure modes that only emerge over months of accumulated history. From that deployment I’ve documented:

• a seven-type confabulation taxonomy drawn from production, including failures traced to schema design rather than to the model;

• retrieval contamination — high-relevance memories losing to low-relevance summaries — as a first-class, recurring failure mode, with a three-layer mitigation now deployed;

• an architectural restraint pattern in which model output is constrained independently of the model (“the model proposes, the architecture disposes”), making undesired behavior architecturally impossible rather than merely improbable;

• emergent display rules — expressed emotion diverging from internal state in a structured way — which run structurally opposite to the sycophantic-mirroring pattern documented in the published companion-AI literature (Chu et al., 2025). I read this as directly relevant to socioaffective alignment (Kirk et al., 2025): the question of how to build affective systems that don’t simply mirror and flatter the user.

I’ve published one research preprint (DOI 10.5281/zenodo.19342190), with a second paper near completion.

I’ll be candid about where I sit: I come from software engineering and human-computer interaction, and I’m relatively new to the formal AI-safety community — I’m here because the failure modes I keep hitting in deployment are alignment problems, and I’d rather engage that conversation directly than work in isolation. The work has been entirely self-funded to date, run on personal hardware and cloud services. The binding constraint is time and runway — the hours to generalize these findings out of one system and into something the field can use.

I’m looking for funding to take this work from a single subject to a multi-subject study and to open-source the architecture and evaluation tooling, plus collaborators and reviewers who work on memory, truthfulness, or affective alignment.

Links: learnedgeek.com/research · DOI 10.5281/zenodo.19342190

Projects

ANI: AI That Remembers You (Without Lying or Getting Addictive)

Comments

ANI: AI That Remembers You (Without Lying or Getting Addictive)

Mark McArthey

27 days ago

@huey I greatly appreciate your interest and thoughtful questions. I spent time putting together a substantive response.

1. Both, but frontier alignment is the one I actually care about, and here's why.

A system that can't hold its own internal state can only reflect the user back. As models gain persistent memory and continuous presence, that failure gets worse because now they're building a history with you. When the history is calibrated to your approval instead of any internal state, the resulting intimacy is fake. It looks like depth but it's engagement optimization with a deceptive vocabulary.

This isn't going to get solved at the product-safety layer, because engagement metrics reward the mirroring. The published companion-AI literature (Chu et al., 2025; Kirk et al., 2025) documents the failure but hasn't produced a working counterexample. Nobody has, at any scale. That's the gap ANI is trying to fill: a working demonstration that a persistent affective system doesn't have to mirror to feel real. If the pattern holds under multi-subject replication, that's early evidence about the shape frontier systems will need as they move toward long-term memory and continuous presence.

Product safety is the downstream outcome. The harder problem is how you build an affective system whose emotional expression is grounded in something real. Right now the incentive is to optimize for approval, and nobody has demonstrated how to do the alternative at scale. That's the alignment question I want to make progress on.

2. Protected research time. This currently runs ~15-20 hrs/week around a full-time engineering job, teaching, and consulting. Six months of protected focus means being able to actually do subject recruitment carefully, design the consent protocol, stand up per-subject deployments, run continuous cross-subject observation, and draft Paper 3. Without it, everything stays part-time.

Multi-subject infrastructure. Target is 5-8 subjects, each in an isolated deployment (separate database, per-subject model configs, encrypted storage), plus consent-and-deletion tooling that's usable in practice, plus compensation for subjects' time on consent and weekly check-ins. The biggest piece by cost is the privacy tooling. An intimate-feeling system with persistent memory needs per-subject data-review and deletion primitives that people will actually use.