Memory System

Why memory matters

Standard chatbots forget everything when the conversation ends. Ask the same question twice and you get the same answer — no context, no learning, no continuity.

AI Partner uses a 5-layer memory system that persists across conversations, days, and weeks. When you ask about a vendor you spoke to last month, AI Partner pulls up their record — their communication style, open commitments, last contact date — without you repeating it.


The 5 layers

1. Episodic Memory — the event timeline

What it stores: A timestamped record of everything that happens — goals run, emails sent, meetings attended, files created, decisions made.

Think of it as: A diary. Every event is stored with its type, timestamp, content, and metadata.

Example entries:

2026-05-10 09:15 goal_completed "Researched NVIDIA Q1 earnings; created slide deck"
2026-05-10 10:30 email_sent "Replied to sarah@acme.com re: proposal"
2026-05-10 14:00 meeting_attended "Joined Teams call with Sequoia; 45 min; action items extracted"
2026-05-11 07:05 heartbeat_task "Morning briefing: 5 news items delivered to Telegram"

Queryable by: time range, event type, keyword, or semantic similarity.
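
As a rough illustration of the timeline and its filters, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and the query_events helper are assumptions made for this example, not AI Partner's actual schema. Semantic similarity is left out because it depends on the vector layer described later.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE episodic_events (
        id       INTEGER PRIMARY KEY,
        ts       TEXT NOT NULL,  -- ISO 8601 timestamp, sorts lexicographically
        type     TEXT NOT NULL,  -- e.g. goal_completed, email_sent
        content  TEXT NOT NULL,
        metadata TEXT            -- optional JSON blob
    )
    """
)
events = [
    ("2026-05-10T09:15", "goal_completed", "Researched NVIDIA Q1 earnings; created slide deck"),
    ("2026-05-10T10:30", "email_sent", "Replied to sarah@acme.com re: proposal"),
    ("2026-05-10T14:00", "meeting_attended", "Joined Teams call with Sequoia; 45 min; action items extracted"),
    ("2026-05-11T07:05", "heartbeat_task", "Morning briefing: 5 news items delivered to Telegram"),
]
conn.executemany(
    "INSERT INTO episodic_events (ts, type, content) VALUES (?, ?, ?)", events
)


def query_events(conn, start, end, event_type=None, keyword=None):
    """Filter by time range, and optionally by event type and keyword."""
    sql = "SELECT ts, type, content FROM episodic_events WHERE ts >= ? AND ts < ?"
    params = [start, end]
    if event_type is not None:
        sql += " AND type = ?"
        params.append(event_type)
    if keyword is not None:
        sql += " AND content LIKE ?"
        params.append(f"%{keyword}%")
    sql += " ORDER BY ts"
    return conn.execute(sql, params).fetchall()


# Everything on May 10 that mentions Sequoia.
rows = query_events(conn, "2026-05-10T00:00", "2026-05-11T00:00", keyword="Sequoia")
```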

2. Biographic Facts — semantic knowledge about you

What it stores: Subject / Predicate / Object facts consolidated from the episodic timeline. Facts have confidence scores and are updated over time.

Think of it as: A knowledge graph of things the agent knows to be true about you and your world.

Example facts:

(Alex, works_at, Acme Inc.) confidence: 1.0
(Alex, is_raising, Series A) confidence: 0.95
(Sequoia, is_contact_of, Alex) confidence: 0.9
(Alex, prefers_communication_style, direct+data) confidence: 0.85
(Acme v2.0, launches_by, 2026-07-15) confidence: 0.8

Built automatically from what you tell the agent and what it observes. No manual tagging required.
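
One way to picture a fact with a confidence score is a small record keyed by subject and predicate. The sketch below is an assumption-heavy illustration: the Fact and FactStore names are hypothetical, and the update rule (a new observation replaces the old one only when it is at least as confident) is one plausible policy, not necessarily the one AI Partner applies.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    confidence: float


class FactStore:
    """Hypothetical store that keeps one fact per (subject, predicate)."""

    def __init__(self):
        self._facts = {}

    def upsert(self, fact):
        key = (fact.subject, fact.predicate)
        current = self._facts.get(key)
        # Assumed rule: replace only if the new fact is at least as confident.
        if current is None or fact.confidence >= current.confidence:
            self._facts[key] = fact
        return self._facts[key]

    def about(self, subject):
        """Return every stored fact whose subject matches."""
        return [f for f in self._facts.values() if f.subject == subject]


facts = FactStore()
facts.upsert(Fact("Alex", "works_at", "Acme Inc.", 1.0))
facts.upsert(Fact("Alex", "is_raising", "Series A", 0.95))
facts.upsert(Fact("Acme v2.0", "launches_by", "2026-07-15", 0.8))
# A weaker, conflicting observation (made up for this example) is ignored.
kept = facts.upsert(Fact("Acme v2.0", "launches_by", "2026-08-01", 0.6))
```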

3. Counterparty Store — your contact graph

What it stores: One stable record per person you interact with, linked across every channel.

Bob at Acme might be bjones@acme.com in email, @bjones in Slack, and user 123456789 in Telegram. The counterparty store unifies all three into one record:

Name: Bob Jones
Company: Acme Inc.
Class: client
Aliases:
- bjones@acme.com (email)
- @bjones (Slack)
- 123456789 (Telegram)
Tone: formal-friendly; responds quickly; prefers bullet points
Last contact: 2026-05-08 via email
Open commitments:
- "Will send revised proposal by May 15" (from meeting on May 8)

Used by: AuthorityPolicy (gating by relationship class), email/DM proxy (personalising reply tone), meeting proxy (recognising who's speaking).
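
The unification step can be sketched as a lookup from (channel, handle) pairs to a single record. The class and field names below are hypothetical, and the case-insensitive matching is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Counterparty:
    name: str
    company: str
    relationship_class: str
    aliases: set = field(default_factory=set)  # set of (channel, handle) pairs


class CounterpartyStore:
    """Hypothetical alias index: many handles map to one record."""

    def __init__(self):
        self._by_alias = {}

    def add(self, record):
        for channel, handle in record.aliases:
            self._by_alias[(channel, handle.lower())] = record

    def resolve(self, channel, handle):
        """Return the record for this handle, or None if it is unknown."""
        return self._by_alias.get((channel, handle.lower()))


bob = Counterparty(
    name="Bob Jones",
    company="Acme Inc.",
    relationship_class="client",
    aliases={
        ("email", "bjones@acme.com"),
        ("slack", "@bjones"),
        ("telegram", "123456789"),
    },
)
contacts = CounterpartyStore()
contacts.add(bob)

# All three channels resolve to the same object.
same_person = contacts.resolve("slack", "@bjones") is contacts.resolve("telegram", "123456789")
```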

4. Vector Memory — semantic search

What it stores: Embedding vectors for all memory entries, enabling search by meaning rather than exact keywords.

Think of it as: A semantic search index. "Find everything related to our funding round" returns relevant entries even if they don't contain the exact phrase "funding round".

How it works:

  • Every memory entry is converted to a vector embedding (OpenAI, Cohere, Ollama, or TF-IDF as fallback)
  • At query time, the query is embedded and compared against stored vectors using cosine similarity
  • Results are reranked with BM25 (keyword overlap) and MMR (diversity — avoids returning 10 near-identical results)

Automatically used by the agent's reasoning step — no explicit command needed.
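
To make the cosine similarity and MMR steps concrete, here is a small pure-Python sketch. It is a textbook version of both techniques, not AI Partner's implementation; the function names, the lam trade-off parameter, and the toy 2-dimensional vectors are assumptions. BM25 reranking is omitted to keep the example short.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(query_vec, candidates, k=3, lam=0.7):
    """Maximal Marginal Relevance over (id, vector) pairs.

    lam=1.0 ranks purely by relevance; lower values penalise candidates
    that are similar to results already selected.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def score(item):
            relevance = cosine(query_vec, item[1])
            redundancy = max(
                (cosine(item[1], chosen[1]) for chosen in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [item_id for item_id, _ in selected]


query = [1.0, 0.5]
candidates = [("a", [1.0, 0.0]), ("b", [1.0, 0.1]), ("c", [0.0, 1.0])]

diverse = mmr(query, candidates, k=2, lam=0.5)         # near-duplicate "a" is skipped
relevance_only = mmr(query, candidates, k=2, lam=1.0)  # "a" and "b" both returned
```

In the toy data, "a" and "b" point in almost the same direction, so with lam=0.5 the second slot goes to the less similar "c" instead.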

5. RAG Knowledge Base — your documents

What it stores: PDFs, Word docs, and other files you upload, chunked and embedded for retrieval.

Think of it as: A searchable library. Upload your company handbook, investor deck, or technical spec — the agent can pull relevant sections at query time.

How to use:

  1. Go to Knowledge Base in the sidebar
  2. Upload any PDF, Word doc, or text file
  3. Wait for ingestion (chunking + embedding, ~30 seconds per document)
  4. Ask the agent: "Based on our investor deck, what is our go-to-market strategy?"

The agent runs hybrid search (vector + keyword) across your uploaded docs and cites the source.
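
A simple way to think about hybrid search is a weighted blend of a vector similarity score and a keyword score, with the source file carried along so it can be cited. The sketch below assumes that shape; the function names, the alpha weight, and the precomputed vector scores are all hypothetical, and the real ranking may differ.

```python
def keyword_score(query, text):
    """Fraction of query terms that appear in the text (case-insensitive)."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0


def hybrid_rank(query, chunks, vector_scores, alpha=0.5):
    """Blend vector and keyword scores.

    chunks: list of dicts with 'source' and 'text' keys.
    vector_scores: similarity in [0, 1] for each chunk, in the same order
    (assumed to come from an embedding search step).
    """
    scored = []
    for chunk, vec_score in zip(chunks, vector_scores):
        score = alpha * vec_score + (1 - alpha) * keyword_score(query, chunk["text"])
        scored.append((score, chunk["source"], chunk["text"]))
    return sorted(scored, key=lambda item: item[0], reverse=True)


chunks = [
    {"source": "deck.pdf", "text": "Our go-to-market strategy targets mid-size firms"},
    {"source": "handbook.pdf", "text": "Remote work policy"},
]
ranked = hybrid_rank("go-to-market strategy", chunks, vector_scores=[0.8, 0.2])
top_score, top_source, top_text = ranked[0]
```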


How the agent uses memory

During the reasoning step of each ReAct iteration, the agent automatically:

  1. Runs memory_hybrid_search for the current sub-task topic
  2. Retrieves the top-k most relevant memories
  3. Injects them into the reasoning prompt as context

You don't need to say "remember when..." — the agent does this automatically for every reasoning step.
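
The three steps above can be sketched as a function that keeps the top-k search hits and prepends them to the prompt. This is a minimal illustration: build_reasoning_prompt, the (score, text) input shape, and the prompt wording are assumptions, and the search call itself (memory_hybrid_search) is represented only by its already-scored results.

```python
def build_reasoning_prompt(task, scored_memories, k=3):
    """Keep the k highest-scoring memories and inject them as context.

    scored_memories: list of (score, text) pairs, assumed to be the
    output of a search step such as memory_hybrid_search.
    """
    top = sorted(scored_memories, key=lambda m: m[0], reverse=True)[:k]
    lines = ["Relevant memories:"]
    lines += [f"- {text}" for _, text in top]
    lines += ["", f"Current sub-task: {task}"]
    return "\n".join(lines)


hits = [
    (0.91, "Joined Teams call with Sequoia; 45 min; action items extracted"),
    (0.12, "Morning briefing: 5 news items delivered to Telegram"),
    (0.78, "Replied to sarah@acme.com re: proposal"),
]
prompt = build_reasoning_prompt("Draft a follow-up to Sequoia", hits, k=2)
```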


Querying memory yourself

You can ask the agent to surface memories directly:

"What do I know about Bob Jones?"
→ Returns counterparty record + recent episodic events involving Bob

"What happened in last week's meetings?"
→ Returns episodic events of type meeting_attended from the last 7 days

"What have I committed to recently?"
→ Returns open commitments from counterparty store

"Search my knowledge base for our refund policy"
→ Runs hybrid search across uploaded documents

Or use the Memory Inspector panel (sidebar → Memory) to browse all 5 layers visually.


Privacy and persistence

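All memory is stored locally in SQLite, so the memory database itself never leaves your machine. If you configure a hosted embedding provider (OpenAI or Cohere), the text being embedded is sent to that provider's API; Ollama and TF-IDF keep embedding local. The database lives at: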

Docker volume: /app/data/mindful-assistant.db
Local path: ~/.mindful-assistant/data/mindful-assistant.db

To clear specific memory: use the Memory Inspector panel and delete individual entries.
To export: GET /api/memory/export returns a JSON dump of all layers.
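
If you want to script the export, a minimal sketch with Python's standard library might look like this. The base URL and port in the usage line are placeholders (the docs above do not state where the API listens), the helper names are hypothetical, and any authentication your instance requires is not shown.

```python
import json
import urllib.request


def build_export_request(base_url):
    """Build a GET request for the export endpoint.

    base_url is an assumption: use the address your AI Partner instance serves on.
    """
    url = base_url.rstrip("/") + "/api/memory/export"
    return urllib.request.Request(url, method="GET")


def export_memory(base_url, out_path):
    """Fetch the JSON dump and save it to a local file."""
    with urllib.request.urlopen(build_export_request(base_url)) as resp:
        dump = json.load(resp)
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(dump, fh, indent=2)
    return dump


# Usage (placeholder address, not a documented default):
# export_memory("http://localhost:3000", "memory-export.json")
```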


The Knowledge Base in detail

  1. Upload a document

    Go to Knowledge Base → Upload Document. Supports PDF, DOCX, TXT, MD, CSV. Max 50 MB per file.

  2. Wait for ingestion

    The document is split into chunks (~500 tokens each), each chunk is embedded, and all chunks are stored in the database. The UI shows a progress indicator.

  3. Ask questions

    Type a question referencing your document. The agent automatically pulls relevant chunks:

    "Based on our product spec, what are the API rate limits?"
    "What does our employee handbook say about remote work?"
    "Summarize the key terms from the contract I uploaded"

  4. Search directly

    Use the Knowledge Base panel to run searches without going through the chat:

    • Keyword search: exact term matching
    • Semantic search: meaning-based matching
    • Hybrid: both combined (default)
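
The chunking in step 2 can be approximated as a sliding window over tokens. The sketch below is a simplification: it treats whitespace-separated words as tokens, and the overlap between neighbouring chunks is an assumption added for illustration (the docs above only state the ~500-token chunk size).

```python
def chunk_text(text, max_tokens=500, overlap=50):
    """Split text into chunks of at most max_tokens whitespace tokens.

    overlap (an assumed parameter) repeats the tail of each chunk at the
    start of the next so a sentence cut at a boundary appears in both.
    """
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be >= 0 and smaller than max_tokens")
    tokens = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


document = " ".join(f"word{i}" for i in range(1200))
chunks = chunk_text(document, max_tokens=500, overlap=50)
```

With 1,200 words, a 500-token window, and a 50-token overlap, the windows start at tokens 0, 450, and 900, which gives three chunks.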

Embedding providers

AI Partner tries embedding providers in this order (falls back automatically if one isn't configured):

Priority  Provider                        Requires
1         OpenAI text-embedding-3-small   OPENAI_API_KEY
2         Cohere                          COHERE_API_KEY
3         Ollama (local)                  Ollama running locally
4         TF-IDF                          Nothing (always available, no API key needed)

TF-IDF is keyword-based, not semantic — it always works but is less precise for meaning-based queries. For best results, set OPENAI_API_KEY or COHERE_API_KEY.
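
The fallback order can be read as a simple chain of checks. The sketch below mirrors the priority table; the function name, the returned labels, and the ollama_running flag (standing in for however the app detects a local Ollama server) are assumptions for illustration.

```python
import os


def pick_embedding_provider(env=None, ollama_running=False):
    """Return the first available provider, following the priority table."""
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai:text-embedding-3-small"
    if env.get("COHERE_API_KEY"):
        return "cohere"
    if ollama_running:  # hypothetical flag: how detection works is not documented here
        return "ollama"
    return "tfidf"  # always available, no API key needed


choice = pick_embedding_provider(env={"COHERE_API_KEY": "example-key"})
```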