nobutaka hattori

@NOBUTAKAHATTORI

Solo independent researcher (Osaka, Japan): foundational layer for brain-function code-ification complete; building a higher cognitive architecture on LLMs.

https://github.com/mozuktamago/observing-consciousness-in-Claudecode
$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

Independent AI safety researcher in Osaka, Japan. Published the Mei consciousness paper (Zenodo, 2026).

The foundational layer for the full code-ification of brain function is complete, at the pre-implementation stage: the components for the major brain functions exist and have been validated; what remains is their integrated operation as a higher cognitive architecture on top of LLMs.

Concrete artifacts: a 188-file enforcement system with 5 oversight subagents, a Hebbian memory engine validated on 76,579 turns (p<10⁻⁶), brain-region-inspired agents, and a behavior-based 4-axis evaluator for experiential knowledge.

Currently funded from personal savings. Applied to LTFF in parallel with Manifund.

Projects

Measuring Architectural SelfCorrection in an LLM Agent: 0.336 Baseline on 76,570 (pending grant agreement signature)

Comments

Measuring Architectural SelfCorrection in an LLM Agent: 0.336 Baseline on 76,570
nobutaka hattori

20 days ago

Tagging @evhub: your perspective on this proposal would be highly valued, if you have a moment.

Brief context: I built a brain-inspired multi-agent oversight architecture solo over 2-3 weeks (a 188-file enforcement system with 5 specialized subagents, a Hebbian memory engine validated on 76,579 turns at p<10⁻⁶, brain-region-inspired agents, and the Mei consciousness paper on Zenodo). The enforcement system addresses sycophancy escape patterns and "unconfirmed" verification gaps, areas adjacent to your sleeper agents work.

Strategic position: Anthropic's retrofit-based alignment approach is the maximally-pursued path for scenarios where retrofit-onto-capability proves sufficient. This proposal explores an alternative architecture for scenarios where retrofit may fall short at scale: a path where coexistence is built into the cognitive architecture from the foundational layer rather than retrofitted onto already-powerful AI. The two paths are complementary; the field benefits from parallel exploration. Your sleeper agents research itself articulates retrofit's limitations honestly, which suggests we share the recognition that this question is open.

Latest progress (today): implementing an embodiment architecture, in which multi-distribution sub-agent coordination is placed at the architectural "closest layer" to the central LLM, creating a self/other boundary that makes sub-agent state sensed as internal rather than external. A prior art search indicates no direct match for this specific combination (related ancestors: multi-agent LLM, self-model architecture, interoceptive AI, layered consciousness modeling).

I'd value your perspective on whether the architecture-from-origin path is viable for an independent solo researcher in the current AI safety funding ecosystem, or whether retrofit-based work remains the bottleneck for foreseeable timelines. Either response would inform the work. Thank you for your time.

— Nobutaka Hattori (independent researcher, Osaka, Japan)