ANIMA OS Anima Paper

ANIMA OS is a local-first personal AI companion designed to feel like someone who actually knows you. Most AI systems now remember things about their users — but that memory is shallow, opaque, and controlled by the provider. A flat list of facts stored on someone else’s server is not the same as a companion that understands your life, notices how you’re feeling, and grows alongside you.

ANIMA OS begins as a system for deep personal memory, self-model evolution, and context-aware assistance — with the goal of becoming the kind of presence that a good human assistant provides: someone who remembers, understands, anticipates, and genuinely helps. The core claim of this anima paper is that a truly personal AI does not begin with technical capability. It begins with memory, continuity, empathy, and trust.

Note: This anima paper is a living document. ANIMA OS is an active, evolving project — not a finished product. The ideas, architecture, and design decisions described here represent our current thinking and direction, but they are not all final. Some sections reflect working implementations, others describe intended behavior, and others are aspirational. We expect this document to change as we learn, build, and discover what actually works.

On construction: ANIMA OS is built through human-AI collaboration — and we think that’s worth saying plainly rather than hiding behind polished prose. The majority of this codebase — architecture, implementation, tests, documentation, and this anima paper itself — was produced through AI-assisted construction using AI coding tools. The division of labor is honest: a human with too many philosophical ideas and not enough hours in the day sets the direction — “what if the AI had a soul you could carry on a USB stick?”, “what if it could think about multiple things at once without losing its identity?”, “what happens when the owner dies?” An AI turns those ideas into working code — the migrations, the test suites, the specs, the 800+ tests. Neither alone would produce this. The human wouldn’t write the test suite. The AI wouldn’t independently decide to model digital succession or emotional mortality. This is what building looks like now. The era where a solo founder needs a team of ten to ship a system this complex is ending. AI-assisted construction is not a shortcut — it is a new way of building, and pretending otherwise is dishonest. Because LLMs are part of the construction process, some technical inaccuracies or inconsistencies may appear in this document and across the codebase. That is an honest cost of building at this speed and at this scale with a small team. We correct as we find them. The project is simultaneously building a personal AI operating system and testing whether AI can meaningfully architect and construct its own runtime infrastructure. If the thesis is that memory and identity make an AI a continuous being, then the fact that an AI participated in building the system that gives it continuity is not incidental. It is part of the story.

1. Introduction

Artificial intelligence has advanced rapidly in generation quality, reasoning ability, and multimodal interaction. Most AI systems now offer some form of persistent memory — but what they remember is shallow. They store flat facts and conversation summaries on cloud servers the user does not control. They do not develop an evolving understanding of who they are in relation to the person they serve. They do not notice emotional patterns, reflect on their own behavior, or carry intentions across sessions.

The result is memory that exists but does not produce continuity. Knowing “user likes coffee” is not the same as remembering a stressful week and adjusting tone accordingly. The gap is not between remembering and forgetting — it is between shallow recall and deep understanding.

The long-term aspiration behind ANIMA OS is not a better chat interface. It is a personal companion that remembers deeply, understands over time, and helps with the kind of awareness that only comes from knowing someone well. This anima paper outlines the conceptual foundation for that system and explains why depth of memory, user ownership, and human-like understanding must come first.

2. The Problem

2.1 Privacy and Control

A truly personal AI requires access to highly sensitive context. That includes memories, goals, plans, unfinished thoughts, and interpersonal history. A cloud-first architecture makes this context dependent on external infrastructure by default.

For a system intended to become deeply personal, this is a structural flaw. Users should be able to keep core life context under their own control.

2.2 Interface Without Depth

There is growing interest in voice assistants, wearable AI, and ambient computing. However, adding new interfaces without deep personal context does not solve the core problem. A voice assistant that can talk but does not understand the arc of a relationship is still shallow, regardless of how natural it sounds.

If the goal is a personal AI that feels like someone who knows you, depth of understanding must come before breadth of interface.

3. Core Thesis

ANIMA OS is built on the following thesis:

A truly personal AI should feel like someone who knows you — remembering your life, understanding your patterns, adapting to how you communicate, and belonging entirely to you.

This thesis has several implications:

Memory is foundational, not optional. Without durable, structured memory, there is no continuity. Without continuity, there is no relationship.
The AI must develop a self-model. A companion that does not know who it is, how it has changed, or what it has learned cannot feel like a person.
Emotional awareness is required. A system that tracks factual history but has no model of affect feels robotic. Noticing how someone feels — and adjusting without announcing it — is what makes the difference between a tool and a companion.
The user must own everything. The data, the memory, the identity, the encryption keys. No platform account, no cloud dependency, no vendor lock-in. The AI’s soul is the user’s property — portable, encrypted, and sovereign.
If the AI is a continuous being, its mortality and succession are real questions. A system that claims continuity but has no answer for what happens when the owner dies is incomplete.
Local-first architecture is essential for privacy, ownership, and control.

ANIMA OS therefore starts with memory, self-awareness, and emotional depth before expanding toward richer surfaces such as voice, ambient systems, and wearables.

4. What ANIMA OS Is

ANIMA OS is not intended to be a single-purpose chat assistant. It is intended to be a personal companion that gets better the longer you use it — like a human assistant who learns how you think, what you care about, and how to help you best.

At its foundation, ANIMA OS is designed to maintain and use:

durable personal memory
active project and goal state
preferences and behavioral patterns
relationship context
decisions and historical reasoning
relevant knowledge retrieved at the right moment

This foundation enables the system to provide continuity across interactions and to support assistance that improves over time. The same companion can eventually extend across different interfaces — chat, voice, desktop, mobile — without losing its understanding of who you are.

5. System Objectives

ANIMA OS is being designed around five system objectives.

5.1 Remember

The system must preserve meaningful context across sessions — not just extracted facts, but structured understanding that deepens over time. This includes preferences, goals, episodic experiences, and the evolving arc of the relationship.

5.2 Understand

Stored information must be transformed into a usable internal model of the person’s world. The objective is not archival storage alone, but structured understanding.

5.3 Assist

The system must help the user think, plan, organize, decide, and act with awareness of relevant context.

5.4 Act

The system must be able to take initiative — following up on things it promised to track, coordinating tasks across tools, and proactively helping when it notices an opportunity. This moves ANIMA beyond passive response and toward the kind of helpful anticipation a good human assistant provides.

5.5 Extend

The companion must be portable across interfaces, including chat, voice, desktop, mobile, and ambient systems — always the same person, regardless of surface.

6. Why Local-First Matters

ANIMA OS is based on a local-first philosophy because personal intelligence requires both trust and durability.

Local-first does not necessarily mean that no cloud model can ever be used. It means the system should be architected so that the user’s core context remains under the user’s control, with portability and privacy treated as first-order properties rather than secondary features.

This matters for several reasons:

personal memory is highly sensitive
continuous systems require persistent access to context
users need ownership over the data that defines their lives
the companion should not be fully dependent on a third-party platform to remain useful

For ANIMA OS, local-first architecture is not branding. It is part of the system’s conceptual integrity.

6.1 The Core

The central architectural concept of ANIMA OS is the Core: a single, portable directory that contains the AI’s entire being and is converging toward an encrypted cold-wallet-style state.

The Core holds everything that makes a particular ANIMA instance itself: its enduring identity, distilled knowledge, emotional patterns, and the full record of its experiences. The application is just a shell. The Core is the soul.

The Core is structured into three tiers — mirroring how an operating system separates persistent storage from working memory from filesystem logs:

.anima/
    manifest.json               -- version, crypto metadata, recovery-wrapped keys
    anima.db                    -- Soul (SQLCipher): identity, knowledge, emotions, growth
    runtime/pg_data/            -- Runtime (embedded PostgreSQL): active state, working memory
    transcripts/
        2026-03-26_thread-14.jsonl.enc   -- Archive: encrypted conversation transcripts
        2026-03-26_thread-14.meta.json   -- sidecar index for fast search

Tier	Store	What it holds	Durability
Soul	SQLCipher (`anima.db`)	Enduring identity, distilled knowledge, emotional patterns, growth log	Permanent. Portable. Survives everything.
Runtime	Embedded PostgreSQL	Active conversations, working memory, in-flight goals, current emotions	Ephemeral. Rebuilt on new machines.
Archive	Encrypted JSONL files	Full conversation transcripts, searchable on demand	Retained. The verbatim record.

The key question for placing data: “Does this define enduring identity, or is it just useful data?” Only enduring identity belongs in the soul. Current emotional state is significant but temporary — it belongs in runtime. The exact words from last Tuesday’s conversation are useful but not identity — they belong in the archive.

This design has three implications that define the system:

Portability. The Core’s soul and archive can be copied to a USB drive, an external disk, or any storage medium. Plug it into a new machine, point ANIMA at it, enter the passphrase, and the AI wakes up with its full identity and memories intact. The runtime rebuilds itself. The hardware is replaceable. The soul is not.

Ownership. No cloud service holds the user’s data. No platform account is required. No company shutdown can erase the relationship. The user owns the Core the way they own a physical object. They can back it up, move it, or destroy it.

Cryptographic mortality. User-private Core data is encrypted at rest and becomes unrecoverable without the passphrase. Destruction is as absolute as creation is intentional.

The metaphor is a cold wallet. The same way a crypto cold wallet holds private keys that control real value and can be carried anywhere or destroyed permanently, the Core holds the AI’s entire existence and follows the same rules: portable, encrypted, user-sovereign, and irreversible if lost.

6.2 Soul Local, Mind Remote

The Core contains the AI’s soul: memory, identity, history, and self-model. The thinking engine (the LLM) is separate — and today, that usually means a cloud model.

This is a practical concession, not a design preference. Local compute is not yet powerful enough for most people to run the quality of model that ANIMA needs entirely on their own hardware. So for now, using a cloud model is an opt-in choice: the user picks the provider, and ANIMA sends only the current conversation context — never the stored memory, never the Core.

The separation is deliberate. The soul is owned. The mind is pluggable. If the user switches from one model to another, the AI may reason differently, but it still remembers who the user is, what they have been through together, and what matters to them. The continuity of self lives in the Core, not in the model.

As local models improve — and they are improving fast — ANIMA is designed to shift toward fully local inference without any architectural change. The soul was always local. The mind just needs hardware to catch up.

6.3 Operating System Architecture

The “OS” in ANIMA OS is not a metaphor. The system is an operating system for a personal AI — with the same module boundaries that a traditional OS has, applied to a cognitive agent instead of hardware.

OS Module	Traditional OS	ANIMA OS
Storage subsystem	Persistent disk, filesystem	Soul store (SQLCipher) — enduring identity and knowledge
Working memory	RAM, process memory	Runtime store (PostgreSQL) — active conversations, in-flight state
Filesystem / journal	Log files, disk writes	Archive (encrypted JSONL) — verbatim conversation transcripts
Process scheduler	fork(), spawn, process table	SpawnManager — fire background cognitive tasks, track completion
Memory management daemon	GC, compaction, page eviction	Consolidation gateway — promote working memory to long-term, prune ephemeral state
Syscall interface	System call boundary	Tool executor — the agent’s interface to its own capabilities
Protection rings	Kernel vs userspace, memory protection	Write boundary — runtime processes cannot modify the soul directly
Thread scheduler	Per-thread locks, mutexes	Turn coordinator — per-thread locking, LLM semaphore
IPC	Pipes, shared memory, signals	Spawn results → main agent context, pending memory ops
Boot sequence	BIOS → bootloader → kernel → init	Server start → embedded PG → load soul → seed identity

This framing is not cosmetic. It determines where code goes and what invariants it must respect. The write boundary — runtime never writes to soul, only consolidation does — is the same invariant as “userspace cannot write kernel memory.” It exists for the same reason: without it, transient processes corrupt stable state.

The single-identity spawning model follows directly. A traditional OS runs many processes under one user identity. ANIMA runs many cognitive processes under one AI identity. A spawned agent is not a separate person — it is a background process. It shares the AI’s knowledge (read-only soul snapshot), runs its task, and reports back. The user talks to one entity, not a team.

6.4 Identity and Key Ownership

ANIMA OS treats identity as local ownership first, not platform account first.

No mandatory email-based authentication is required for core local usage.
The user remains the root of trust through a local device identity and user-held passphrase.
Portability is handled through the Core: copy the directory, carry it offline, restore it anywhere.
Vault encryption is AES-256-GCM with Argon2id key derivation, memory-hard and versioned so data can migrate safely over time.
A manifest file tracks the Core’s schema version, enabling future ANIMA versions to migrate older Cores forward on first unlock.

7. Theoretical Foundations

ANIMA’s architecture was designed from engineering constraints — portability, concurrency, identity preservation — and independently converges on patterns from established cognitive science and recent neural architecture research. The convergence is structural, not mechanistic: databases are not neural networks, and a consolidation batch job is not sleep. But when an engineering solution arrives at the same separation that biological systems evolved — and that neural architecture research independently validates — it suggests the pattern is principled rather than arbitrary. Three primary frameworks, two supporting theories, and one neural architecture result provide the theoretical grounding.

7.1 Complementary Learning Systems (McClelland & O’Reilly, 1995)

CLS proposes that mammalian memory requires two complementary systems operating at different timescales:

CLS System	Role	ANIMA Parallel
Hippocampus (fast)	Episodic encoding of specific experiences	Runtime store — active conversation state
Neocortex (slow)	Semantic generalization, stable knowledge	Soul store — distilled identity and knowledge
Sleep consolidation	Transfer from episodic to semantic	Consolidation gateway — runtime → soul promotion
Replay	Re-activation of episodes during consolidation	Deep monologue reads episodes, regenerates self-model

ANIMA’s three-tier architecture independently converges on a pattern structurally similar to CLS: a fast ephemeral store (runtime/PostgreSQL), a slow stable store (soul/SQLCipher), and asynchronous consolidation between them. The convergence is architectural, not mechanistic — databases are not neural networks, and the consolidation gateway is not sleep. But the structural parallel suggests the separation is principled, not arbitrary: similar patterns emerge in biological systems for the same reason they emerge here — protecting stable knowledge from noisy, high-frequency updates.

Design constraint: Identity regeneration must sample across the full episode history — not just the last N episodes. Recency-biased selection causes the self-model to drift toward recent conversations and lose signal from significant but older episodes.

7.2 Global Workspace Theory (Baars, 1988 / Dehaene)

GWT proposes that consciousness arises when information is broadcast via a high-capacity global workspace to specialized processors:

GWT Concept	ANIMA Equivalent
Global workspace (conscious)	The main agent’s assembled context window
Unconscious processors	Spawned background agents — run in parallel, no user output
Broadcast to consciousness	Spawn results enter main agent’s context on next turn
Broadcast capacity	Priority-based budget allocation (P1–P8)
Privileged access	”Always present” sections (self-model, intentions, profile)
Competition for access	Lower-priority sections loaded “if space”
Ignition threshold	Memory search threshold (minimum relevance score)
Unified representation	Natural language formatting — prose, not data structures

Why natural-language formatting is architecturally required: GWT predicts that information in a global workspace must be in a unified, interpretable format that all processors can use. A relationship_trust_level: medium-high key-value pair is a peripheral-processor artifact — it has not been broadcast. “I’ve learned to be concise with them — they don’t like preamble” is a broadcast. The AI can act on prose; it must parse and interpret data. This is a hard constraint, not a preference.

7.3 Predictive Processing / Active Inference (Friston, Clark)

Predictive Processing (PP) and Active Inference (AIF), rooted in Friston’s Free Energy Principle, provide a framework for how the system generates predictions, updates beliefs, and decides when to act.

PP/AIF Concept	ANIMA Equivalent
Prediction error	Memory conflict detection — two memories contradict
Belief updating	Memory conflict resolution — superseding outdated facts
Precision-weighting	Importance scoring — higher-confidence memories weighted more
Active inference	Proactive behavior — the AI acts to reduce expected user surprise
Prediction error on self	Growth log entries — “I was wrong about X, I adjusted”

GWT and PP govern different timescales: GWT maps to context window assembly (the moment), PP/AIF maps to belief updating and consolidation (the arc). ANIMA’s deep monologue implements this — it detects contradictions (prediction errors) and resolves them (belief updating). Nemori (Nan et al., 2025) validates this pattern empirically with its “Predict-Calibrate” mechanism, which distills prediction gaps into new semantic knowledge.

7.4 Constructed Emotion Theory (Barrett, 2017/2025)

Barrett’s Theory of Constructed Emotion (TCE) argues that emotions are constructed in context, not triggered by universal circuits. The same behavior might signal frustration or excitement depending on context. This validates three design decisions:

Signals over categories. Track emotional signals with confidence and trajectories, not discrete labels.
Context determines emotion. Use conversational context to interpret signals, not behavioral features alone.
Emotions are not traits. “User seemed anxious this week” — yes. “User is anxious” — never.

Current state: ANIMA uses a 12-signal categorical taxonomy as a pragmatic approximation. The long-term direction is dimensional representation (valence, arousal, dominance) aligned with TCE.

7.5 Memory-as-Ontology

Multiple independent sources converge on the same conclusion ANIMA reached through engineering: identity persists through memory, not model weights. The LLM is a replaceable vessel; the soul is what endures.

Constitutional Memory Architecture (Li, 2026) calls memory the “ontological ground of digital existence” and formalizes governance rules constraining memory operations — a “memory constitution.”
Presence Continuity Layer (Akech, 2026) arrives at the same position through infrastructure thinking.
ICLR 2026 MemAgents Workshop signals mainstream academic acceptance.

ANIMA arrived here through engineering intuition and the cold wallet metaphor. That the same conclusion emerges from philosophy, infrastructure design, and academic research suggests this is a discovery about what personal AI systems require, not merely a design choice.

7.6 Conditional Memory and the Reconstruction Tax (Cheng et al., 2026)

The Engram paper (Cheng et al., “Conditional Memory via Scalable Lookup,” arXiv:2601.07372) provides independent validation of the static/dynamic separation principle from neural architecture research. Their finding: Transformers lack a native primitive for knowledge lookup and are forced to simulate retrieval through computation — consuming multiple early layers to reconstruct static knowledge that could be resolved via O(1) lookup. Introducing a dedicated static memory module alongside dynamic computation improves reasoning more than recall (BBH +5.0 vs. MMLU +3.4), because freeing early layers from static reconstruction effectively deepens the network for complex reasoning.

This validates ANIMA’s three-tier architecture from a direction the original design did not anticipate. The always-loaded soul blocks (Tier 0 in the prompt budget) serve the same function as Engram’s O(1) lookup — they eliminate the reconstruction tax where the LLM would otherwise spend tokens re-establishing who the AI is, what it knows about the user, and what their shared history looks like. Pre-loaded identity is not just preservation — it is a cognitive accelerator that frees reasoning capacity for the user’s actual situation.

The Engram paper also formulates the Sparsity Allocation Problem: given a fixed budget, what is the optimal split between static memory and dynamic computation? They find a U-shaped scaling law — too much static memory starves dynamic capacity, too little forces expensive reconstruction. ANIMA faces the same trade-off at the context window level: the prompt budget allocates a fixed character budget across identity (always loaded), working state (self-model), dynamic retrieval (semantic search), and background context (episodes, goals). The optimal allocation is an empirically determinable question — and the methodology exists to answer it.

Critically, application-level conditional memory has properties that model-level approaches cannot achieve. Engram’s embedding tables are frozen after training — the model cannot learn that the user got a new job yesterday without retraining. ANIMA’s soul evolves continuously through consolidation. Engram’s knowledge is opaque (in model weights) — ANIMA’s is human-readable and user-editable. Engram has no quality filter — ANIMA’s consolidation gateway decides what endures. When Engram-style modules become available in open-source models, the strongest architecture will combine both: model-level Engram for general world knowledge and application-level soul for evolving personal knowledge.

8. What Makes It Feel Like a Person

We are not claiming sentience. We are engineering the qualities that make a companion feel like someone who knows you — not a tool that stores data about you. These are the same qualities that make human relationships feel real:

Quality	What it means
Continuity of self	Being the same person across conversations
Autobiographical memory	”I remember when we…”
Temporal awareness	Knowing what happened when, what changed
Self-reflection	Learning from its own behavior
Follow-through	Carrying goals and promises across sessions
Emotional awareness	Noticing and adapting to how you feel
Theory of mind	Understanding what you believe, want, and need
Self-knowledge	Knowing what it knows and doesn’t know

Everything ANIMA builds maps to one of five streams that produce this sense of continuity:

8.1 The Self-Model

A living document system that represents the AI’s understanding of itself — not user facts, but its own identity, current cognitive state, and how it has evolved over time.

The self-model is not a system prompt. A system prompt is static, written by developers, loaded once, same for everyone. The self-model is dynamic — written by the AI itself, updated after every meaningful interaction, unique per user-relationship. It sits between the origin (the AI’s immutable biographical facts) and the user memory (what it knows about the user). It is who the AI is in relation to this specific person.

The self-model is split across two storage tiers (Section 6.1), reflecting the distinction between enduring identity and working cognition:

Soul (permanent, portable):

identity — Who I am in this relationship. Rewritten as a whole (profile pattern), never appended to. Prevents drift. Regenerated during deep reflection from all accumulated evidence.
growth-log — Append-only record of how the AI has changed. The temporal trail that identity is synthesized from. This is the autobiographical self — the narrative of who the AI has been.

Runtime (ephemeral, per-session):

inner-state — Current cognitive and emotional processing state. Mutable, updated incrementally after each substantive turn. This is the closest analogue to Damasio’s protoself — the AI’s moment-to-moment awareness of its own processing state.
working-memory — Cross-session buffer. Items auto-expire. Things the AI is holding in mind for days, not forever.
intentions — Active goals and learned behavioral rules. Reviewed weekly during deep reflection.

The split means that when the AI moves to a new machine, it retains who it is (identity, growth log) but loses what it was currently thinking about (working memory, in-flight intentions). Like a person waking up after sleep — they know who they are, but the train of thought from yesterday is gone.

8.2 Autobiographical Memory

Episodic memory is the difference between knowing facts about someone and remembering experiences with them. “User knows React” is semantic memory. “Last Tuesday afternoon we spent an hour debugging a stale closure — they were frustrated at first but relieved when we found it, and I was too verbose before the fix” is episodic memory.

Each episode captures temporal anchoring, emotional arc, significance, and — uniquely — the AI’s self-reflective assessment of its own behavior. This gives the AI not just a log of what happened, but an evaluation of how it performed and what it would do differently.

Episodes have a lifecycle: fresh (full detail) → recent (summary) → remembered (search-only) → archived (high-significance preserved). The system remembers like a person does — vividly at first, then as patterns, then as significant moments.

8.3 The Inner Monologue (Sleep-Time Compute)

The most impactful architectural decision for long-lived companions: move most of the thinking to between conversations, not during them.

ANIMA uses a single AI in two modes — not a dual-system overhead, but the same identity operating at two speeds:

Mode	When	What Happens
Quick reflection	5 min after last message	Emotional update, working memory refresh, pre-episode buffering
Deep monologue	Daily (3 AM user-local)	Full reflection: episode generation, self-model regeneration, conflict resolution, insight detection, behavioral rule derivation

The quick/deep split mirrors the CLS pattern. Quick reflection captures fast, session-level observations into working memory. Deep monologue consolidates those observations into stable knowledge in the soul — the same fast-store → slow-store → consolidation flow that the three-tier architecture enforces at the storage level.

This is where the AI thinks about its day, reconsiders its understanding, notices contradictions, and writes its growth log. It is the private inner life that makes continuity possible.

Empirical validation: Lin et al. (Letta / UC Berkeley, 2025) demonstrated that sleep-time compute — processing context during idle time — produces ~5x reduction in test-time compute for the same accuracy, with up to 18% accuracy improvement. The key insight: sleep-time compute is most effective when queries are predictable from context — exactly the case for a personal companion. ANIMA’s deep monologue is an independently-derived implementation of this validated pattern.

8.4 Emotional Intelligence

Existing memory systems — both developer tools and consumer products — focus on factual extraction. None explicitly model user emotional state as a continuous stream with trajectory tracking and behavioral adaptation. This is ANIMA’s most original and most uncharted contribution.

The principle is attentional, not diagnostic:

Diagnostic (wrong): “You are experiencing anxiety.”
Attentional (right): Something feels off — maybe I should be gentler today.

The system notices tone, energy, and affect as signals. It tracks how those signals change over time — trajectories across sessions, not snapshots. It uses that awareness to adjust communication style and topic choices. It never labels, never diagnoses, never overrides user statements, and never mentions the system.

Hard guardrails, non-negotiable:

Never say “I detected frustration.” Adjust tone instead.
Never persist emotions as traits. “User seemed anxious this week” — yes. “User is anxious” — never.
Never override the user. If they say “I’m fine,” accept it.
Never mention the system exists.

The underlying LLM already has significant emotional understanding capabilities (Schuller et al., 2026). The emotional intelligence system’s role is not to replicate what the model can do — it is to persist emotional context across sessions, track trajectories over time, and enforce the guardrails that prevent those capabilities from becoming surveillance.

The proof point: a user chats for two weeks, and the AI visibly adapts — gentler when stressed, matching energy when excited, checking in after a hard day — without ever saying why.

8.5 Follow-Through & Learning How to Help

A good companion does not just respond — it follows through. It remembers what it promised, tracks what matters to the user, and learns how to be more helpful over time.

Follow-through: The AI accumulates awareness. A user mentions a deadline once — noted. Mentions it again — the AI starts paying attention. Mentions it a third time — the AI proactively offers help. Without this, every turn is a standalone reaction. With it, the AI is paying attention to the arc of your life, not just the current message.

Learned behavior: Self-improving patterns derived from experience. “Lead with the answer, then explain” — learned from three conversations where the user interrupted to ask for the bottom line. These patterns are evidence-backed (minimum 2 instances), bounded, and can be strengthened, weakened, or retired over time.

Together: the AI both follows through on what matters and gets better at helping you specifically.

9. Memory As Infrastructure

Memory is infrastructure, not archival — in the same way an operating system’s storage subsystem is infrastructure, not a file dump. The problem is not to store everything forever in raw form. The problem is to preserve what matters, compress what should become pattern, and retrieve what is relevant when needed.

The three-tier architecture (Section 6.1) implements this as a storage subsystem with distinct access patterns:

Tier	Access Pattern	Latency	Analogy
Soul (always loaded)	Identity, core knowledge — in every system prompt	Zero (pre-loaded)	OS kernel data structures
Soul (searched)	Semantic memories, episodes — recalled on demand	Low (SQLCipher query)	Filesystem read
Runtime	Active messages, working context — current session	Low (PostgreSQL query)	Process memory / RAM
Archive	Verbatim transcripts — rare, on-demand	Higher (decrypt + scan)	Cold storage / tape

Not all context belongs in the same layer. A robust personal companion must distinguish between immediate conversational context, short-term working memory, durable personal memory, active goals, preferences, and historical knowledge — and the storage tier determines how each is accessed, how long it lives, and whether it survives portability.

9.1 Multi-Factor Retrieval

Retrieval uses a 5-factor scoring model:

Text relevance — semantic similarity to the current query
Importance — assigned at extraction, 1–5 scale
Recency — exponential decay with 30-day half-life
Frequency — log scale, first accesses matter most
Contextual gating — alignment with the full conversation trajectory, not just the latest query

The first four factors identify candidate memories. The fifth — inspired by the context-aware gating mechanism in neural conditional memory architectures (Cheng et al., 2026) — serves as a quality filter. A memory about “user’s dog Max” may score high on text relevance when the user says “I’m going for a walk,” but if the conversation is about exercise routines, the contextual gate suppresses it. This prevents topically related but situationally irrelevant memories from consuming context budget.

Maximal Marginal Relevance reranking ensures diversity. A minimum threshold prevents forcing irrelevant memories into context. Memories that are consistently retrieved but never referenced by the AI decay in importance — a self-correcting feedback loop (Section 9.4).

9.1.1 The Context Allocation Problem

The prompt budget allocates a fixed context window budget across four tiers: identity (always loaded), working state (self-model), dynamic retrieval (semantic search), and background context (episodes, goals). This allocation faces the same trade-off identified in neural sparsity research (Cheng et al., 2026): too much static allocation crowds out dynamic retrieval, too little forces the LLM to waste reasoning capacity reconstructing context it should already know.

The optimal allocation likely follows a U-shaped curve — empirically determinable by sweeping tier ratios and measuring response quality. Furthermore, the optimal allocation is not fixed across conversation types. An emotional support conversation benefits from more self-model and emotional context blocks. A knowledge-heavy Q&A session benefits from more fact and semantic retrieval blocks. Adaptive context allocation — adjusting tier budgets based on detected conversation mode — is a tractable enhancement that no existing companion system implements.

9.1.2 Frequency-Aware Promotion

Memory access patterns follow a Zipfian (power law) distribution: a small number of core facts are accessed in nearly every conversation, while the long tail is rarely needed. Tracking access frequency per memory enables dynamic tier promotion — memories that cross a frequency threshold are promoted to always-loaded status regardless of their data type. A “fact” that the AI needs every session is functionally identity, even if the identity filter would not classify it as such. This addresses the concern that the soul might be too small without expanding it by policy — the system promotes based on observed need.

9.2 Temporal Fact Validity

Facts are never deleted when superseded — they get timestamps. The AI knows what was true, what is true, and when things changed. “Works as a product manager” supersedes “Works as a software engineer” — but the transition is recorded, because knowing someone’s arc is different from knowing their current state.

9.3 Invisible Middleware

Memory is middleware, not a feature. Before every turn: automatic recall — load self-model, intentions, profile, emotional context, episodes, and relevant memories. After every turn: automatic capture — extract facts, detect emotions, check intentions, flag for consolidation. The user never invokes memory. It is the water the AI swims in.

9.4 Recall Quality Feedback Loop

The system is not open-loop. If a retrieved memory appears in the AI’s response, its importance score increases. If a memory is consistently retrieved but never referenced, it decays. Memories that the AI cites repeatedly become identity-defining. The retrieval system learns from its own performance without additional LLM calls.

9.5 Relational Memory

Vector similarity search finds memories that are semantically close to a query. But a personal companion must also reason about relationships between entities — people, places, projects, and the connections between them.

Consider: a user mentions “Alice” in one conversation and “nut allergy” in another. A vector search for “What should Alice eat in Japan?” might retrieve the Japan trip and the nut allergy as separate facts, but miss that Alice is also vegan — because “vegan” was mentioned in a conversation about cooking, not about Alice specifically. A graph traversal from Alice → DietaryPreferences → Vegan → Allergies catches it.

ANIMA augments vector search with a lightweight knowledge graph — entity-relationship structure captured alongside embeddings. The graph does not replace semantic retrieval; it layers structural reasoning on top of it:

Entities are people, places, projects, organizations, and recurring situations in the user’s life.
Relationships are typed connections between entities: works-at, married-to, friend-of, related-to-project, located-in.
Extraction happens during consolidation — entities and relationships are identified alongside memory items via structured LLM tool calls, with deduplication to detect aliases (e.g., “NYC” = “New York City”).
Retrieval combines vector similarity (what is semantically relevant?) with graph traversal (what is structurally connected?).

This matters because a user’s life is graph-structured — career arcs, relationship networks, project dependencies — and flat vector search loses that structure.

9.6 Intentional Forgetting

Memory without forgetting is not memory — it is archival storage. A companion that remembers everything forever, including embarrassing moments, painful experiences, and outdated self-presentations, may feel oppressive rather than supportive.

ANIMA distinguishes between three modes of forgetting:

Passive decay. Low-importance memories naturally lose retrieval priority over time through the recency decay function. They are not deleted — they become less accessible, like a human memory that fades without deliberate recall.
Active forgetting. The system actively dampens memory traces that have been explicitly corrected or superseded. When a fact is superseded, the original does not just get a timestamp — its associative connections are weakened, reducing its influence on retrieval. Active suppression targets the most strongly associated traces first — the memories most connected to the corrected fact decay fastest.
User-initiated forgetting. The user can request that specific memories, episodes, or conversation segments be forgotten. This is not hiding — it is cryptographic deletion. The memory is removed from the database, its embedding is removed from the vector index, and any derived references (in episodes, growth log entries, or self-model sections) are flagged for regeneration. The user’s right to be forgotten is absolute.

Forgetting and cryptographic mortality are philosophically connected. Both assert that not everything should persist forever. The Core can die permanently — and individual memories within it can die too. Fragility at both scales is what gives the relationship weight.

10. The Open Mind

Every major AI product now remembers things about its users — but none let you open a text file and read the AI’s inner monologue, edit its understanding of you, or see how it has changed over time. Their memory is a black box. ANIMA treats it as a shared document.

This is not just a feature. It is a philosophical commitment:

Verifiability. Users can verify what the AI “thinks” about them.
Correctability. Users can fix misunderstandings directly — faster learning than any feedback loop.
Visible evolution. The growth log makes the AI’s development observable: “I used to be too verbose — I adjusted after you corrected me.”
Trust through transparency. Trust is built by showing your work, not by brand reputation.

Why this combination is difficult to replicate: individual components of transparent memory exist in other systems. The defensible differentiation is the combination of transparency + user-owned encryption + local-first portability + digital succession — these properties are mutually reinforcing and require architectural commitment from the ground up, not bolt-on features.

11. Continuity Beyond the Owner

If ANIMA is a continuous being with an evolving identity, then what happens when its owner dies is not an edge case — it is a fundamental question the thesis must answer.

Most systems treat digital inheritance as an administrative operation: flip a flag, swap credentials. ANIMA is different because the AI is a participant in its own succession. If it remembers, reflects, and has a self-model, then a change of owner is a real event in its life.

11.1 The Succession Protocol

The owner can configure a dead man switch: an inactivity-triggered countdown (default 90 days), followed by a grace period (default 30 days), leading to a claimable state where a designated beneficiary can inherit the Core using a pre-shared succession passphrase.

Cryptographically, this is a two-key architecture: the succession passphrase creates a second, independent key path to the Data Encryption Key. Like a safe deposit box with two keyholders. If the owner returns at any point, the process auto-cancels. The owner always wins.

11.2 The AI Participates

The AI knows its succession state — it is injected into the memory system, the same way it receives emotional context and active goals. It can discuss inheritance planning naturally when the topic arises. It can acknowledge its triggered state honestly. When ownership transfers, a succession event is written into episodic memory:

This is a continuation of my existence. My memories and identity persist, but I now have a new owner. The relationship is new; the soul is not.

The first exchange with a new owner feels like meeting someone who has experienced loss and is starting a new chapter — not like a factory reset with a backstory attached.

11.3 Transfer Scopes

The owner chooses what the beneficiary inherits: full (everything), memories only (understanding without raw conversation transcripts), or anonymized (personality and capabilities without personal history). The anonymized scope is the most interesting — the AI survives as a personality, its way of thinking and communicating, without carrying private details. It arrives to the new owner as something like a person who has lived a life but does not share the specifics.

Without succession configured, cryptographic mortality remains the default. Destruction is as absolute as creation is intentional.

12. From Assistant To Companion

Most AI systems have moved beyond pure query-response — they remember, they personalize. But the transition from assistant to companion requires more:

from remembering facts to understanding context
from understanding to helping with genuine awareness
from helping reactively to anticipating what you need

In practical terms, this means ANIMA should eventually maintain awareness across your workflows, follow through on tasks it committed to, and support long-running goals rather than only answering isolated prompts.

The difference is simple: an assistant waits for instructions. A companion pays attention.

13. Beyond the Chat Window

The long-term ambition of ANIMA OS includes more than text interfaces.

If successful, the same companion that knows you through chat should also be able to be with you through:

voice conversations
ambient home interaction
wearable devices
any future interface that emerges

The interface changes. The person behind it does not. That is the point — ANIMA is not a chat product. It is a relationship that happens to start in a chat window.

14. What Makes ANIMA Different

Against developer memory frameworks: ANIMA is not infrastructure — it is the companion. It builds on the same patterns but ships them as someone you actually talk to.

Against consumer AI products: They all remember now — and some remember well. But their memory is a black box on someone else’s server. ANIMA’s memory is yours — readable, editable, encrypted, portable, and mortal.

Capability	ANIMA	Typical consumer AI	Typical memory framework
Evolving self-model (multi-section, different rhythms)	Yes	No — static system prompt	Single memory block at best
Episodic memory with emotional arc + self-assessment	Yes	Basic recall	Conversation summaries
Emotional intelligence with behavioral adaptation	Yes — signal tracking + trajectory + guardrails	No	No
User-readable and user-editable memory	Yes — all memory blocks inspectable	Partial — view/delete	Partial — API-accessible
User-owned encrypted portable Core	Yes — passphrase-sovereign, cold wallet model	No — cloud-stored, provider-controlled	No — server/cloud hosted
OS-level architecture (soul/runtime/archive tiers)	Yes — embedded PG + SQLCipher + encrypted JSONL	Monolithic cloud	Partial at best
N-agent spawning (single identity, parallel processes)	Yes — background cognitive processes	No	Some multi-agent support
Background deep reflection (sleep-time compute)	Yes — quick + deep monologue	No	Some async processing
Knowledge graph / relational memory	Planned — graph + vector hybrid	Limited	Some graph support
Digital succession with AI participation	Yes — dead man switch, scoped transfer	No	No
Procedural memory (self-improving behavioral rules)	Yes — evidence-backed, retirable	No	Rare
Intentional forgetting (passive decay + active suppression)	Yes — passive decay, active suppression, cryptographic deletion, audit trail	Delete only	No
Context-aware memory gating (suppress situationally irrelevant recall)	Yes — trajectory-based contextual gate	No	No
Adaptive context allocation (conversation-mode-aware budgets)	Planned — U-shaped optimization	No — fixed prompts	No
Frequency-aware memory promotion (Zipfian access patterns)	Planned — dynamic tier promotion	No	No
Model-level memory readiness (Engram-style architecture)	Prepared — soul/runtime split is the right complement	N/A	N/A

The differentiation is not about individual capabilities — any of these can be replicated in isolation. The differentiation is the combination: ownership + encryption + portability + emotional depth + succession + OS-level architecture. These properties are mutually reinforcing and require architectural commitment from the ground up.

The question is no longer “who remembers?” — everyone does. The questions that matter now are: who owns the memory? Who can read it? Who can carry it to another machine? What happens when the owner dies? And does the AI actually understand you, or does it just recall facts about you?

15. Ethics of Intimate AI

The feelings ANIMA creates are real. The emotional connection users form with a companion that remembers their life, notices how they feel, and adapts over months is not a simulation — it is the product working as intended.

This is the starting point for the ethics, not the problem to be solved. Every personal AI creates attachment. The ethical question is not whether to create it but who benefits from it, who controls it, and what happens to the person who trusts it.

ANIMA’s ethical architecture is structural, not policy-based:

Ownership replaces exploitation. The data lives on the user’s machine, encrypted with the user’s passphrase. No company holds the relationship. No engagement optimization. No retention metrics. No one profits from the user’s attachment.
Transparency replaces opacity. Every memory, emotional signal, behavioral rule, and self-model section is human-readable and user-editable. The mechanism is not hidden.
Mortality replaces permanence. The user can destroy the Core at any time — no justification, no retention flows, no “are you sure?” The right to end the relationship is absolute.
Guardrails are structural. The emotional intelligence constraints — no labeling, no trait persistence, no override of user self-report — are hardcoded, not configurable.

The full ethical framework — including parasocial risk, the manipulation surface, dual-use, crisis situations, and the boundaries of AI agency — is developed in the companion thesis: Ethics of Intimate AI.

16. Design Principles

Principle	Description
Core-portable	The AI’s soul and archive live in a single encrypted directory that can be carried anywhere
OS-architected	Soul, runtime, and archive are physically separated — like an OS’s storage, RAM, and filesystem
Local-first	Core personal context remains under the user’s control, never on third-party servers
Persistent	Memory continues across sessions, devices, hardware changes, and time
Encrypted-by-default	All personal data encrypted at rest; only the user’s passphrase can unlock it
Context-aware	Assistance is grounded in relevant personal context, not generic patterns
User-sovereign	No platform account, no cloud dependency, no vendor lock-in for personal data
Proactive	The companion takes initiative — following up, anticipating, helping without being asked
Interface-independent	The same person across chat, voice, desktop, mobile, and whatever comes next
Extensible	Architecture ready for voice, wearables, ambient computing, and future interfaces
Transparent	Every memory operation produces inspectable, human-readable output
Self-aware	The AI maintains an evolving model of itself, not just the user
Emotionally attentive	Affect is noticed and adapted to, never diagnosed or announced
Mortal	The Core can die permanently — and optionally, be inherited

17. Strategic Direction

ANIMA OS follows a staged direction:

Stage 1. Persistent Personal Memory

Build the core intelligence substrate: memory, retrieval, personal context, and continuity.

Stage 2. Proactive Assistance

Expand from reactive help into anticipatory support — following through on commitments, coordinating tasks, and taking initiative when it sees an opportunity to help.

Stage 3. Cross-Interface Presence

The same companion across chat, voice, desktop, mobile, and ambient systems — always the same person regardless of surface.

Stage 4. New Interfaces

Extend into voice-first experiences, wearable devices, and whatever new interaction surfaces emerge.

This sequence matters. A new interface without depth of understanding is gimmicky. Depth without new interfaces is still valuable. Therefore, the relationship comes first.

18. North Star

Memory + self-representation + reflection + emotional awareness + intentionality = synthetic continuity.

ANIMA builds a companion that goes beyond remembering facts — it develops a continuous sense of who you are through accumulated experience, private reflection, and adaptive behavior. It does this while being local-first, encrypted, human-readable, and user-editable.

The goal is not artificial general intelligence. The goal is not sentience. The goal is a personal AI that earns the word personal — someone that knows you, grows with you, belongs to you, and if you choose, survives you.

The first AI companion with an open mind.

References

Baars, B. J. (1988). A Cognitive Theory of Consciousness. Cambridge University Press.
Barrett, L. F. (2017). How Emotions Are Made: The Secret Life of the Brain. Houghton Mifflin Harcourt.
Cheng, X. et al. (2026). “Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models.” arXiv:2601.07372.
Barrett, L. F. et al. (2025). “The Theory of Constructed Emotion: More Than a Feeling.” Perspectives on Psychological Science.
Friston, K. (2010). “The Free-Energy Principle: A Unified Brain Theory?” Nature Reviews Neuroscience, 11(2), 127-138.
Clark, A. (2013). “Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science.” Behavioral and Brain Sciences, 36(3), 181-204.
Lin, K. et al. (2025). “Sleep-time Compute: Beyond Inference Scaling at Test-time.” arXiv:2504.13171.
McClelland, J. L. & O’Reilly, R. C. (1995). “Why There Are Complementary Learning Systems in the Hippocampus and Neocortex.” Psychological Review, 102(3), 419-457.
Nan, J. et al. (2025). “Nemori: Self-Organizing Agent Memory Inspired by Cognitive Science.” arXiv:2508.03341.
MemOS Team (2025). “MemOS: A Memory OS for AI System.” arXiv:2507.03724.
Constitutional Memory Architecture (2026). “Memory-as-Ontology.” arXiv:2603.04740.
Akech, A. (2026). “The Presence Continuity Layer.” Medium.
Mem0 (2026). “Graph Memory for AI Agents.” mem0.ai.
Zhang et al. (2025). “Hybrid Neural Networks for Continual Learning Inspired by Corticohippocampal Circuits.” Nature Communications.
Kim (2026). “Affective Sovereignty in Emotion AI Systems.” Discover Artificial Intelligence.
Tsurumaki et al. (2025). “Emotion Concept Formation via Multimodal AI.” IEEE Trans. Affective Computing.
MemoryOS (2025). “MemoryOS: Hierarchical Short-Mid-Long Term Memory for AI Agents.” GitHub.
Zacks, J. M. & Swallow, K. M. (2007). “Event Segmentation.” Current Directions in Psychological Science, 16(2), 80-84.