Memory System
Why memory matters
Standard chatbots forget everything when the conversation ends. Ask the same question twice and you get the same answer — no context, no learning, no continuity.
AI Partner uses a 5-layer memory system that persists across conversations, days, and weeks. When you ask about a vendor you spoke to last month, AI Partner pulls up their record — their communication style, open commitments, last contact date — without you repeating it.
The 5 layers
1. Episodic Memory — the event timeline
What it stores: A timestamped record of everything that happens — goals run, emails sent, meetings attended, files created, decisions made.
Think of it as: A diary. Every event is stored with its type, timestamp, content, and metadata.
Example entries:
2026-05-10 09:15 goal_completed "Researched NVIDIA Q1 earnings; created slide deck"
2026-05-10 10:30 email_sent "Replied to sarah@acme.com re: proposal"
2026-05-10 14:00 meeting_attended "Joined Teams call with Sequoia; 45 min; action items extracted"
2026-05-11 07:05 heartbeat_task "Morning briefing: 5 news items delivered to Telegram"
Queryable by: time range, event type, keyword, or semantic similarity.
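As a rough illustration of the query options above, here is a minimal in-memory sketch of a timeline filtered by time range, event type, and keyword. The `Event` class, its field names, and `query_events` are hypothetical and chosen for this example; they are not AI Partner's actual schema or API, and semantic similarity is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Event:
    """One row of the episodic timeline (field names are illustrative)."""
    timestamp: datetime
    type: str
    content: str
    metadata: dict = field(default_factory=dict)


def query_events(events, start=None, end=None, event_type=None, keyword=None):
    """Filter events by time range, type, and keyword; every filter is optional."""
    results = []
    for e in events:
        if start is not None and e.timestamp < start:
            continue
        if end is not None and e.timestamp > end:
            continue
        if event_type is not None and e.type != event_type:
            continue
        if keyword is not None and keyword.lower() not in e.content.lower():
            continue
        results.append(e)
    # Return matches in chronological order, like a diary.
    return sorted(results, key=lambda e: e.timestamp)


timeline = [
    Event(datetime(2026, 5, 10, 9, 15), "goal_completed",
          "Researched NVIDIA Q1 earnings; created slide deck"),
    Event(datetime(2026, 5, 10, 10, 30), "email_sent",
          "Replied to sarah@acme.com re: proposal"),
    Event(datetime(2026, 5, 10, 14, 0), "meeting_attended",
          "Joined Teams call with Sequoia; 45 min; action items extracted"),
    Event(datetime(2026, 5, 11, 7, 5), "heartbeat_task",
          "Morning briefing: 5 news items delivered to Telegram"),
]

# Everything that happened on May 10.
may_10 = query_events(timeline, start=datetime(2026, 5, 10),
                      end=datetime(2026, 5, 10, 23, 59))
```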
2. Biographic Facts — semantic knowledge about you
What it stores: Subject / Predicate / Object facts consolidated from the episodic timeline. Facts have confidence scores and are updated over time.
Think of it as: A knowledge graph of things the agent knows to be true about you and your world.
Example facts:
(Alex, works_at, Acme Inc.) confidence: 1.0
(Alex, is_raising, Series A) confidence: 0.95
(Sequoia, is_contact_of, Alex) confidence: 0.9
(Alex, prefers_communication_style, direct+data) confidence: 0.85
(Acme v2.0, launches_by, 2026-07-15) confidence: 0.8
Built automatically from what you tell the agent and what it observes. No manual tagging required.
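To make the idea of facts that carry confidence scores and change over time more concrete, here is a small sketch. The `Fact` and `FactStore` names are hypothetical, and the update rule (a new value replaces the old one only when its confidence is at least as high) is an assumption made for this example; the consolidation logic AI Partner actually uses is not described here.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    confidence: float


class FactStore:
    """Keeps one fact per (subject, predicate) pair; a deliberate simplification."""

    def __init__(self):
        self._facts = {}

    def upsert(self, subject, predicate, obj, confidence):
        key = (subject, predicate)
        current = self._facts.get(key)
        # Assumed rule: replace only when there is no prior fact
        # or the new observation is at least as confident.
        if current is None or confidence >= current.confidence:
            self._facts[key] = Fact(subject, predicate, obj, confidence)
        return self._facts[key]

    def about(self, subject, min_confidence=0.0):
        """Return all facts about a subject at or above a confidence threshold."""
        return [f for f in self._facts.values()
                if f.subject == subject and f.confidence >= min_confidence]


store = FactStore()
store.upsert("Alex", "works_at", "Acme Inc.", 1.0)
store.upsert("Alex", "is_raising", "Series A", 0.95)
store.upsert("Alex", "works_at", "Other Corp", 0.4)  # kept out: lower confidence
```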
3. Counterparty Store — your contact graph
What it stores: One stable record per person you interact with, linked across every channel.
Bob at Acme might be bjones@acme.com in email, @bjones in Slack, and user 123456789 in Telegram. The counterparty store unifies all three into one record:
Name: Bob Jones
Company: Acme Inc.
Class: client
Aliases:
- bjones@acme.com (email)
- @bjones (Slack)
- 123456789 (Telegram)
Tone: formal-friendly; responds quickly; prefers bullet points
Last contact: 2026-05-08 via email
Open commitments:
- "Will send revised proposal by May 15" (from meeting on May 8)
Used by: AuthorityPolicy (gating by relationship class), email/DM proxy (personalising reply tone), meeting proxy (recognising who's speaking).
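The cross-channel unification can be pictured as an alias index that maps each channel-specific handle to the same record. The sketch below is an illustration only: `Counterparty`, `CounterpartyStore`, and `resolve` are hypothetical names, and case-insensitive matching is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Counterparty:
    name: str
    company: str
    relationship_class: str
    aliases: dict = field(default_factory=dict)  # channel -> handle


class CounterpartyStore:
    def __init__(self):
        self._by_alias = {}

    def add(self, person):
        # Index the same record under every (channel, handle) pair.
        for channel, handle in person.aliases.items():
            self._by_alias[(channel, handle.lower())] = person

    def resolve(self, channel, handle):
        """Return the unified record for a channel-specific handle, or None."""
        return self._by_alias.get((channel, handle.lower()))


bob = Counterparty("Bob Jones", "Acme Inc.", "client", {
    "email": "bjones@acme.com",
    "slack": "@bjones",
    "telegram": "123456789",
})
store = CounterpartyStore()
store.add(bob)

# A Slack message and an email both land on the same record.
same_person = store.resolve("slack", "@bjones") is store.resolve("email", "bjones@acme.com")
```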
4. Vector Memory — semantic search
What it stores: Embedding vectors for all memory entries, enabling search by meaning rather than exact keywords.
Think of it as: A semantic search index. "Find everything related to our funding round" returns relevant entries even if they don't contain the exact phrase "funding round".
How it works:
- Every memory entry is converted to a vector embedding (OpenAI, Cohere, Ollama, or TF-IDF as fallback)
- At query time, the query is embedded and compared against stored vectors using cosine similarity
- Results are reranked with BM25 (keyword overlap) and MMR (diversity — avoids returning 10 near-identical results)
Automatically used by the agent's reasoning step — no explicit command needed.
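The similarity and diversity steps can be sketched in a few lines. This is a generic implementation of cosine similarity and Maximal Marginal Relevance (MMR), not AI Partner's code; the `lam` trade-off parameter and the toy two-dimensional vectors are assumptions for illustration, and the BM25 step is omitted.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k=3, lam=0.5):
    """Return indices of up to k documents chosen by Maximal Marginal Relevance."""
    remaining = list(range(len(doc_vecs)))
    selected = []
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalise similarity to anything already picked.
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.2]
docs = [
    [1.0, 0.0],   # close to the query
    [1.0, 0.01],  # near-duplicate of the first document
    [0.6, 0.8],   # less relevant, but different
]

diverse = mmr(query, docs, k=2, lam=0.5)         # balances relevance and diversity
relevance_only = mmr(query, docs, k=2, lam=1.0)  # plain similarity ranking
```

With `lam=1.0` the two near-duplicates are returned together; lowering `lam` swaps one of them for the distinct third document.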
5. RAG Knowledge Base — your documents
What it stores: PDFs, Word docs, and other files you upload, chunked and embedded for retrieval.
Think of it as: A searchable library. Upload your company handbook, investor deck, or technical spec — the agent can pull relevant sections at query time.
How to use:
- Go to Knowledge Base in the sidebar
- Upload any PDF, Word doc, or text file
- Wait for ingestion (chunking + embedding, ~30 seconds per document)
- Ask the agent: "Based on our investor deck, what is our go-to-market strategy?"
The agent runs hybrid search (vector + keyword) across your uploaded docs and cites the source.
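One simple way to picture hybrid search is a weighted sum of a vector score and a keyword score. How AI Partner actually fuses the two signals is not specified here, so the sketch below is an assumption throughout: `hybrid_rank`, the `alpha` weight, the term-overlap scorer (a crude stand-in for a real keyword ranker), and the hard-coded vector scores are all hypothetical.

```python
def keyword_score(query, text):
    """Fraction of query terms that appear in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_rank(query, chunks, vector_scores, alpha=0.5):
    """Rank chunks by alpha * vector score + (1 - alpha) * keyword score."""
    scored = []
    for chunk, vec_score in zip(chunks, vector_scores):
        combined = alpha * vec_score + (1 - alpha) * keyword_score(query, chunk["text"])
        scored.append((combined, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored]


chunks = [
    {"source": "handbook.pdf", "text": "remote work is allowed two days per week"},
    {"source": "deck.pdf", "text": "our go-to-market strategy targets mid-size retailers"},
]
# Pretend similarity scores that an embedding model might have produced.
vector_scores = [0.2, 0.8]

ranked = hybrid_rank("go-to-market strategy", chunks, vector_scores)
top_source = ranked[0]["source"]  # the citation shown alongside the answer
```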
How the agent uses memory
During the reasoning step of each ReAct iteration, the agent automatically:
- Runs memory_hybrid_search for the current sub-task topic
- Retrieves the top-k most relevant memories
- Injects them into the reasoning prompt as context
You don't need to say "remember when..." — the agent does this automatically for every reasoning step.
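The injection step can be sketched as follows. The prompt layout and the `build_reasoning_prompt` helper are hypothetical; the memories are passed in as an already-retrieved list of scored items, because the real signature and return shape of memory_hybrid_search are not documented here.

```python
def build_reasoning_prompt(sub_task, memories, top_k=3):
    """Prepend the top_k highest-scoring memories to the reasoning prompt."""
    top = sorted(memories, key=lambda m: m["score"], reverse=True)[:top_k]
    lines = ["Relevant memories:"]
    lines += [f"- {m['text']}" for m in top]
    lines += ["", f"Current sub-task: {sub_task}"]
    return "\n".join(lines)


retrieved = [
    {"text": "Bob Jones prefers bullet points", "score": 0.91},
    {"text": "Morning briefing delivered to Telegram", "score": 0.12},
    {"text": "Bob will send revised proposal by May 15", "score": 0.87},
]

prompt = build_reasoning_prompt("Draft a reply to Bob Jones", retrieved, top_k=2)
```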
Querying memory yourself
You can ask the agent to surface memories directly:
"What do I know about Bob Jones?"
→ Returns counterparty record + recent episodic events involving Bob
"What happened in last week's meetings?"
→ Returns episodic events of type meeting_attended from the last 7 days
"What have I committed to recently?"
→ Returns open commitments from counterparty store
"Search my knowledge base for our refund policy"
→ Runs hybrid search across uploaded documents
Or use the Memory Inspector panel (sidebar → Memory) to browse all 5 layers visually.
Privacy and persistence
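All memory is stored locally in SQLite, and the database file stays on your machine. One caveat: if a hosted embedding provider (OpenAI or Cohere) is configured, the text being embedded is sent to that provider's API. The database lives at: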
Docker volume: /app/data/mindful-assistant.db
Local path: ~/.mindful-assistant/data/mindful-assistant.db
To clear specific memory: use the Memory Inspector panel and delete individual entries.
To export: GET /api/memory/export returns a JSON dump of all layers.
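A small script can call the export endpoint and summarise what comes back. Only the path /api/memory/export comes from this page; the base URL is whatever host and port your instance listens on, and the shape of the JSON (one top-level key per layer, each holding a list) is an assumption made for this sketch.

```python
import json
from urllib.request import urlopen


def fetch_export(base_url):
    """GET /api/memory/export from the given base URL and parse the JSON body."""
    with urlopen(f"{base_url}/api/memory/export") as response:
        return json.load(response)


def summarize_export(dump):
    """Count entries per top-level key, assuming each layer is a JSON list."""
    return {layer: len(entries) for layer, entries in dump.items()
            if isinstance(entries, list)}


# Example with a hand-written dump; the layer names here are illustrative only.
sample_dump = {
    "episodic": [{"type": "email_sent"}, {"type": "meeting_attended"}],
    "facts": [{"subject": "Alex", "predicate": "works_at", "object": "Acme Inc."}],
    "exported_at": "2026-05-11T07:05:00",
}
counts = summarize_export(sample_dump)
```

Against a running instance, `summarize_export(fetch_export(base_url))` would give the same kind of per-layer count, provided the real dump matches the assumed shape.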
The Knowledge Base in detail
1. Upload a document
Go to Knowledge Base → Upload Document. Supports PDF, DOCX, TXT, MD, CSV. Max 50 MB per file.
2. Wait for ingestion
The document is split into chunks (~500 tokens each), each chunk is embedded, and all chunks are stored in the database. The UI shows a progress indicator.
3. Ask questions
Type a question referencing your document. The agent automatically pulls relevant chunks:
"Based on our product spec, what are the API rate limits?"
"What does our employee handbook say about remote work?"
"Summarize the key terms from the contract I uploaded"
4. Search directly
Use the Knowledge Base panel to run searches without going through the chat:
- Keyword search: exact term matching
- Semantic search: meaning-based matching
- Hybrid: both combined (default)
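The chunking part of ingestion (splitting a document into pieces of roughly 500 tokens) can be approximated as below. This is a sketch under a simplifying assumption: tokens are treated as whitespace-separated words, whereas the real tokenizer and any chunk-boundary rules AI Partner applies are not documented here. `chunk_text` is a hypothetical helper.

```python
def chunk_text(text, max_tokens=500):
    """Split text into consecutive chunks of at most max_tokens words."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]


# A 1,200-word document yields two full chunks and one shorter remainder.
document = "word " * 1200
chunks = chunk_text(document)
chunk_sizes = [len(chunk.split()) for chunk in chunks]
```

Each chunk would then be embedded and stored individually, so that a question can retrieve only the relevant sections rather than the whole file.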
Embedding providers
AI Partner tries embedding providers in this order (falls back automatically if one isn't configured):
| Priority | Provider | Requires |
|---|---|---|
| 1 | OpenAI text-embedding-3-small | OPENAI_API_KEY |
| 2 | Cohere | COHERE_API_KEY |
| 3 | Ollama (local) | Ollama running locally |
| 4 | TF-IDF | Nothing — always available, no API key needed |
TF-IDF is keyword-based, not semantic — it always works but is less precise for meaning-based queries. For best results, set OPENAI_API_KEY or COHERE_API_KEY.
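The fallback order in the table can be expressed as a simple priority check. The function name, the returned labels, and the `ollama_running` flag are hypothetical; only the environment variable names and the priority order come from the table above.

```python
import os


def pick_embedding_provider(env=None, ollama_running=False):
    """Return the first usable provider in the documented priority order."""
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("COHERE_API_KEY"):
        return "cohere"
    if ollama_running:
        return "ollama"
    # TF-IDF needs no key and no local service, so it is always available.
    return "tfidf"


no_config = pick_embedding_provider({}, ollama_running=False)
cohere_only = pick_embedding_provider({"COHERE_API_KEY": "example-key"})
```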