Memory System

Why memory matters

Standard chatbots forget everything when the conversation ends. Ask the same question twice and you get the same answer — no context, no learning, no continuity.

AI Partner uses a 5-layer memory system that persists across conversations, days, and weeks. When you ask about a vendor you spoke to last month, AI Partner pulls up their record — their communication style, open commitments, last contact date — without you repeating it.

The 5 layers

1. Episodic Memory — the event timeline

What it stores: A timestamped record of everything that happens — goals run, emails sent, meetings attended, files created, decisions made.

Think of it as: A diary. Every event is stored with its type, timestamp, content, and metadata.

Example entries:

2026-05-10 09:15  goal_completed     "Researched NVIDIA Q1 earnings; created slide deck"
2026-05-10 10:30  email_sent         "Replied to sarah@acme.com re: proposal"
2026-05-10 14:00  meeting_attended   "Joined Teams call with Sequoia; 45 min; action items extracted"
2026-05-11 07:05  heartbeat_task     "Morning briefing: 5 news items delivered to Telegram"

Queryable by: time range, event type, keyword, or semantic similarity.
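
To make the query options concrete, here is a minimal sketch of an event table and a filter over it. It uses SQLite because that is where memory is stored, but the table name, column names, and the query_events helper are assumptions made for illustration, not the actual schema. Semantic similarity is left out because it needs embeddings.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE episodic_events (
           id INTEGER PRIMARY KEY,
           ts TEXT NOT NULL,          -- ISO 8601 text sorts chronologically
           event_type TEXT NOT NULL,
           content TEXT NOT NULL
       )"""
)
conn.executemany(
    "INSERT INTO episodic_events (ts, event_type, content) VALUES (?, ?, ?)",
    [
        ("2026-05-10T09:15", "goal_completed", "Researched NVIDIA Q1 earnings; created slide deck"),
        ("2026-05-10T10:30", "email_sent", "Replied to sarah@acme.com re: proposal"),
        ("2026-05-10T14:00", "meeting_attended", "Joined Teams call with Sequoia; 45 min; action items extracted"),
        ("2026-05-11T07:05", "heartbeat_task", "Morning briefing: 5 news items delivered to Telegram"),
    ],
)


def query_events(start, end, event_type=None, keyword=None):
    """Filter by time range [start, end), optionally by event type and keyword."""
    sql = "SELECT ts, event_type, content FROM episodic_events WHERE ts >= ? AND ts < ?"
    params = [start, end]
    if event_type is not None:
        sql += " AND event_type = ?"
        params.append(event_type)
    if keyword is not None:
        sql += " AND content LIKE ?"
        params.append(f"%{keyword}%")
    return conn.execute(sql + " ORDER BY ts", params).fetchall()


# All meetings on May 10.
meetings = query_events("2026-05-10", "2026-05-11", event_type="meeting_attended")
```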

2. Biographic Facts — semantic knowledge about you

What it stores: Subject / Predicate / Object facts consolidated from the episodic timeline. Facts have confidence scores and are updated over time.

Think of it as: A knowledge graph of things the agent knows to be true about you and your world.

Example facts:

(Alex, works_at, Acme Inc.)                         confidence: 1.0
(Alex, is_raising, Series A)                         confidence: 0.95
(Sequoia, is_contact_of, Alex)                       confidence: 0.9
(Alex, prefers_communication_style, direct+data)     confidence: 0.85
(Acme v2.0, launches_by, 2026-07-15)                 confidence: 0.8

Built automatically from what you tell the agent and what it observes. No manual tagging required.
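
One way to picture facts that are "updated over time" is a small store keyed by subject and predicate. The sketch below is illustrative only: the Fact and FactStore names, the one-fact-per-pair simplification, and the update rule (keep the higher confidence, replace a conflicting value only when the new evidence is at least as strong) are all assumptions, not the consolidation logic AI Partner actually uses.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    confidence: float


class FactStore:
    """Keeps one fact per (subject, predicate) pair (a simplification)."""

    def __init__(self):
        self._facts = {}

    def observe(self, subject, predicate, obj, confidence):
        key = (subject, predicate)
        current = self._facts.get(key)
        if current is not None and current.obj == obj:
            # Repeated evidence for the same value: keep the higher confidence.
            current.confidence = max(current.confidence, confidence)
        elif current is None or confidence >= current.confidence:
            # New fact, or a conflicting value with at least as much support.
            self._facts[key] = Fact(subject, predicate, obj, confidence)
        return self._facts[key]

    def about(self, subject):
        return [f for f in self._facts.values() if f.subject == subject]


store = FactStore()
store.observe("Alex", "works_at", "Acme Inc.", 1.0)
store.observe("Alex", "is_raising", "Seed", 0.6)
store.observe("Alex", "is_raising", "Series A", 0.95)  # replaces the weaker fact
```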

3. Counterparty Store — your contact graph

What it stores: One stable record per person you interact with, linked across every channel.

Bob at Acme might be bjones@acme.com in email, @bjones in Slack, and user 123456789 in Telegram. The counterparty store unifies all three into one record:

Name: Bob Jones
Company: Acme Inc.
Class: client
Aliases:
  - bjones@acme.com (email)
  - @bjones (Slack)
  - 123456789 (Telegram)
Tone: formal-friendly; responds quickly; prefers bullet points
Last contact: 2026-05-08 via email
Open commitments:
  - "Will send revised proposal by May 15" (from meeting on May 8)

Used by: AuthorityPolicy (gating by relationship class), email/DM proxy (personalising reply tone), meeting proxy (recognising who's speaking).
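
The unification step can be sketched as a lookup from (channel, handle) pairs to a single record. The class and field names below are hypothetical, and treating handles as case-insensitive is an assumption; the sketch only shows the idea of several aliases resolving to the same record.

```python
from dataclasses import dataclass, field


@dataclass
class Counterparty:
    name: str
    company: str
    relationship_class: str
    aliases: set = field(default_factory=set)  # {(channel, handle), ...}


class CounterpartyStore:
    def __init__(self):
        self._by_alias = {}

    def add(self, record):
        # Index the same record under every alias it is known by.
        for channel, handle in record.aliases:
            self._by_alias[(channel, handle.lower())] = record

    def resolve(self, channel, handle):
        """Return the unified record for a channel-specific handle, or None."""
        return self._by_alias.get((channel, handle.lower()))


contacts = CounterpartyStore()
bob = Counterparty(
    name="Bob Jones",
    company="Acme Inc.",
    relationship_class="client",
    aliases={("email", "bjones@acme.com"), ("slack", "@bjones"), ("telegram", "123456789")},
)
contacts.add(bob)
```

With this in place, a Slack message from @bjones and an email from bjones@acme.com both resolve to the same Bob Jones record.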

4. Vector Memory — semantic search

What it stores: Embedding vectors for all memory entries, enabling search by meaning rather than exact keywords.

Think of it as: A semantic search index. "Find everything related to our funding round" returns relevant entries even if they don't contain the exact phrase "funding round".

How it works:

1. Every memory entry is converted to a vector embedding (OpenAI, Cohere, Ollama, or TF-IDF as fallback)
2. At query time, the query is embedded and compared against stored vectors using cosine similarity
3. Results are reranked with BM25 (keyword overlap) and MMR (diversity, which avoids returning 10 near-identical results)

Automatically used by the agent's reasoning step — no explicit command needed.
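
The similarity and diversity steps can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not AI Partner's code: the function names, the lam trade-off parameter, and the toy 2-dimensional vectors are all made up for the example, and BM25 is omitted for brevity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, candidates, k=3, lam=0.7):
    """Maximal Marginal Relevance over a dict of id -> vector.

    Greedily picks items that are relevant to the query but dissimilar to
    the items already picked. lam=1.0 means relevance only.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def score(item_id):
            relevance = cosine(query_vec, remaining[item_id])
            redundancy = max(
                (cosine(remaining[item_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


query = [1.0, 0.0]
vectors = {"a": [1.0, 0.0], "a_dup": [1.0, 0.01], "c": [0.6, 0.8]}
diverse = mmr(query, vectors, k=2, lam=0.3)         # skips the near-duplicate
relevance_only = mmr(query, vectors, k=2, lam=1.0)  # keeps the near-duplicate
```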

5. RAG Knowledge Base — your documents

What it stores: PDFs, Word docs, and other files you upload, chunked and embedded for retrieval.

Think of it as: A searchable library. Upload your company handbook, investor deck, or technical spec — the agent can pull relevant sections at query time.

How to use:

1. Go to Knowledge Base in the sidebar
2. Upload any PDF, Word doc, or text file
3. Wait for ingestion (chunking + embedding, ~30 seconds per document)
4. Ask the agent: "Based on our investor deck, what is our go-to-market strategy?"

The agent runs hybrid search (vector + keyword) across your uploaded docs and cites the source.
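
The chunking part of ingestion can be sketched as follows. This is a rough illustration, not the actual ingestion code: it counts whitespace-separated words as a stand-in for tokens, and the overlap between neighbouring chunks is an assumption added so that a sentence cut at a boundary still appears whole in one chunk.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# A 1,200-word document becomes three chunks starting at words 0, 450, and 900.
document = " ".join(f"w{i}" for i in range(1200))
chunks = chunk_text(document)
```

Each chunk would then be embedded and stored alongside a pointer to its source document, which is what lets the agent cite where an answer came from.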

How the agent uses memory

During the reasoning step of each ReAct iteration, the agent automatically:

1. Runs memory_hybrid_search for the current sub-task topic
2. Retrieves the top-k most relevant memories
3. Injects them into the reasoning prompt as context

You don't need to say "remember when..." — the agent does this automatically for every reasoning step.
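
The three steps above can be sketched as a retrieve-then-inject helper. Only the name memory_hybrid_search comes from this page; its signature, the word-overlap ranking used as a placeholder, the MEMORIES list, and the prompt layout are assumptions made so the example runs on its own.

```python
# Stand-in memory; the real system searches across all five layers.
MEMORIES = [
    "Bob Jones will send revised proposal by May 15",
    "Joined Teams call with Sequoia; action items extracted",
    "Morning briefing: 5 news items delivered to Telegram",
]


def memory_hybrid_search(query, top_k=2):
    """Placeholder ranking by shared words (the real search is vector + keyword)."""
    q = set(query.lower().split())
    scored = [(len(q & set(m.lower().split())), m) for m in MEMORIES]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for score, m in scored[:top_k] if score > 0]


def build_reasoning_prompt(sub_task, top_k=2):
    """Retrieve the top-k memories and place them ahead of the sub-task."""
    memories = memory_hybrid_search(sub_task, top_k=top_k)
    context = "\n".join(f"- {m}" for m in memories) or "- (no relevant memories)"
    return f"Relevant memories:\n{context}\n\nCurrent sub-task: {sub_task}"


prompt = build_reasoning_prompt("follow up on the revised proposal from Bob Jones")
```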

Querying memory yourself

You can ask the agent to surface memories directly:

"What do I know about Bob Jones?"
→ Returns counterparty record + recent episodic events involving Bob

"What happened in last week's meetings?"
→ Returns episodic events of type meeting_attended from the last 7 days

"What have I committed to recently?"
→ Returns open commitments from counterparty store

"Search my knowledge base for our refund policy"
→ Runs hybrid search across uploaded documents

Or use the Memory Inspector panel (sidebar → Memory) to browse all 5 layers visually.

Privacy and persistence

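All memory is stored locally in SQLite, and the database itself never leaves your machine. Note that if OpenAI or Cohere is configured as the embedding provider, the text being embedded is sent to that provider's API; Ollama and TF-IDF keep embedding local. The database lives at: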

Docker volume:  /app/data/mindful-assistant.db
Local path:     ~/.mindful-assistant/data/mindful-assistant.db

To clear specific memory: use the Memory Inspector panel and delete individual entries.
To export: GET /api/memory/export returns a JSON dump of all layers.
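
A minimal sketch of calling the export endpoint from a script, using only the Python standard library. The host and port in BASE_URL are placeholders, and the sketch assumes the endpoint needs no authentication; adjust both to match your installation.

```python
import json
import urllib.request

# Assumption: placeholder address; replace with wherever your instance is served.
BASE_URL = "http://localhost:3000"


def build_export_request(base_url=BASE_URL):
    """Build the GET request for the export endpoint."""
    return urllib.request.Request(f"{base_url}/api/memory/export", method="GET")


def export_memory(path, base_url=BASE_URL):
    """Fetch the JSON dump of all layers and save it to a local file."""
    with urllib.request.urlopen(build_export_request(base_url)) as resp:
        dump = json.load(resp)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(dump, fh, indent=2)
    return dump
```

Calling export_memory("memory-backup.json") against a running instance writes the dump to that file.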

The Knowledge Base in detail

1. Upload a document
Go to Knowledge Base → Upload Document. Supports PDF, DOCX, TXT, MD, CSV. Max 50 MB per file.

2. Wait for ingestion
The document is split into chunks (~500 tokens each), each chunk is embedded, and all chunks are stored in the database. The UI shows a progress indicator.

3. Ask questions
Type a question referencing your document. The agent automatically pulls relevant chunks:

"Based on our product spec, what are the API rate limits?"
"What does our employee handbook say about remote work?"
"Summarize the key terms from the contract I uploaded"

4. Search directly
Use the Knowledge Base panel to run searches without going through the chat:
- Keyword search: exact term matching
- Semantic search: meaning-based matching
- Hybrid: both combined (default)
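
The difference between the three modes can be illustrated with a toy scoring function for each. This is a sketch under assumptions: the function names, the simple term-match keyword score, and the alpha weighting are invented for the example and are not documented settings.

```python
import math


def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def semantic_score(query_vec, doc_vec):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(query_vec, doc_vec))
    norm = math.hypot(*query_vec) * math.hypot(*doc_vec)
    return dot / norm if norm else 0.0


def hybrid_score(query, query_vec, text, doc_vec, alpha=0.5):
    """Weighted blend of both scores; alpha is a hypothetical knob."""
    return alpha * semantic_score(query_vec, doc_vec) + (1 - alpha) * keyword_score(query, text)
```

A chunk that shares no words with the query can still score well in hybrid mode if its embedding is close to the query's, which is why hybrid is a sensible default.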

Embedding providers

AI Partner tries embedding providers in this order (falls back automatically if one isn't configured):

Priority	Provider	Requires
1	OpenAI `text-embedding-3-small`	`OPENAI_API_KEY`
2	Cohere	`COHERE_API_KEY`
3	Ollama (local)	Ollama running locally
4	TF-IDF	Nothing — always available, no API key needed

TF-IDF is keyword-based, not semantic — it always works but is less precise for meaning-based queries. For best results, set OPENAI_API_KEY or COHERE_API_KEY.
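
The fallback order in the table can be sketched as a simple priority check. The function name and the ollama_running flag are hypothetical; how AI Partner actually detects a local Ollama instance is not described here.

```python
import os


def pick_embedding_provider(env=None, ollama_running=False):
    """Return the first usable provider in the documented priority order."""
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai"   # priority 1
    if env.get("COHERE_API_KEY"):
        return "cohere"   # priority 2
    if ollama_running:
        return "ollama"   # priority 3, hypothetical availability flag
    return "tfidf"        # priority 4, always available
```

For example, with only COHERE_API_KEY set the function returns "cohere", and with nothing configured it falls through to "tfidf".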
