From text to vectors
When you store a memory, MemoClaw converts the text into a 512-dimensional vector using OpenAI’s text-embedding-3-small model. These vectors capture semantic meaning — similar concepts produce similar vectors, even with different wording.
For example, “user prefers dark mode” and “they like dark themes” will have vectors that are very close together, even though the words are completely different.
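Closeness here means cosine similarity. The sketch below shows the computation with tiny toy vectors; real embeddings are 512-dimensional and come from the embedding model, so these numbers are purely illustrative:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" standing in for real 512-dimensional ones:
dark_mode_pref = [0.9, 0.1, 0.8, 0.2]      # "user prefers dark mode"
dark_theme_like = [0.85, 0.15, 0.75, 0.3]  # "they like dark themes"
unrelated = [0.1, 0.9, 0.05, 0.7]          # "ship the invoice report Friday"

print(cosine_similarity(dark_mode_pref, dark_theme_like))  # high, near 1.0
print(cosine_similarity(dark_mode_pref, unrelated))        # much lower
```

The two phrasings of the dark-mode preference score far higher against each other than against the unrelated memory, which is what makes recall robust to wording.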
Vector similarity search
When you recall, your query is also converted to a vector. PostgreSQL with pgvector finds the closest stored vectors using cosine distance. The HNSW (Hierarchical Navigable Small World) index makes this fast even with millions of memories.

Scoring formula
Hybrid recall scoring (4-signal approach):

Signals:
- vector_sim: Cosine similarity (0–1), the primary semantic signal
- keyword_match: Full-text/BM25 match, normalized to 0–1, rewarding exact term matches
- recency: `exp(-age_days / 30)`, temporal freshness
- context_importance: Importance dynamically boosted by query relevance

Modifiers:
- access_boost: `min(1 + access_count × 0.1, 2.0)`, so frequently recalled memories rank higher
- type_decay: Exponential decay based on the memory type’s half-life (correction: 180d, preference: 180d, decision: 90d, project: 30d, observation: 14d, general: 60d). Pinned memories are exempt from decay, and memories with more relations decay more slowly.
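The individual terms can be sketched numerically. The way the final score combines them below (averaging the signals, then multiplying by the modifiers) is a hypothetical weighting, not MemoClaw's documented formula, but each term follows the expressions listed above:

```python
import math

# Half-lives (in days) per memory type, from the type_decay description.
HALF_LIFE_DAYS = {
    "correction": 180, "preference": 180, "decision": 90,
    "project": 30, "observation": 14, "general": 60,
}

def recency(age_days: float) -> float:
    # exp(-age_days / 30): 1.0 when fresh, about 0.37 after a month.
    return math.exp(-age_days / 30)

def access_boost(access_count: int) -> float:
    # min(1 + access_count * 0.1, 2.0): caps at 2x after 10 recalls.
    return min(1 + access_count * 0.1, 2.0)

def type_decay(age_days: float, mem_type: str, pinned: bool = False) -> float:
    # Halves every half_life days; pinned memories are exempt.
    if pinned:
        return 1.0
    half_life = HALF_LIFE_DAYS.get(mem_type, HALF_LIFE_DAYS["general"])
    return 0.5 ** (age_days / half_life)

def score(vector_sim, keyword_match, context_importance,
          age_days, access_count, mem_type, pinned=False):
    # Hypothetical combination: average the signals, then apply modifiers.
    base = (vector_sim + keyword_match + recency(age_days) + context_importance) / 4
    return base * access_boost(access_count) * type_decay(age_days, mem_type, pinned)
```

Note how a 90-day-old decision has already lost half its type_decay weight, while the same memory pinned keeps full weight indefinitely.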
Embedding cache
MemoClaw maintains an LRU cache of 1,000 embeddings. Repeated text won’t consume additional OpenAI API tokens, and cache hits are instant. This is particularly useful for agents that recall the same queries across sessions: the embedding is computed once and reused.

Namespaces
Use namespaces to isolate memories per project or context. The default namespace is "default".
Suggested strategy:
| Namespace pattern | Use case |
|---|---|
| default | General user info and preferences |
| project-{name} | Project-specific knowledge |
| session-{date} | Session summaries |
Memories in other namespaces are kept fully separate from the default namespace, without any overlap.
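The suggested patterns are easy to generate consistently. These helper names are illustrative, not part of MemoClaw's API; the normalization rules (lowercase, hyphens, ISO dates) are one reasonable convention:

```python
from datetime import date
from typing import Optional

def project_namespace(name: str) -> str:
    # Follows the project-{name} pattern from the table above.
    return f"project-{name.strip().lower().replace(' ', '-')}"

def session_namespace(day: Optional[date] = None) -> str:
    # Follows the session-{date} pattern, using an ISO date.
    day = day or date.today()
    return f"session-{day.isoformat()}"
```

Using one generator per pattern keeps namespace strings uniform, so recalls scoped to a project never miss because of a casing or spacing mismatch.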
Tags and metadata filtering
Attach tags to memories for category-based filtering. Recall supports filtering by tags (match any) and by date (after).
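The filtering semantics (keep a memory if it has any requested tag, and only if created after a given date) can be sketched as a plain predicate. The dict shape used here is a hypothetical stand-in, not MemoClaw's actual record format:

```python
from datetime import datetime

def matches(memory: dict, tags=None, after=None) -> bool:
    # tags: keep the memory if it carries ANY of the requested tags.
    if tags and not set(tags) & set(memory.get("tags", [])):
        return False
    # after: keep only memories created strictly after the timestamp.
    if after and memory["created_at"] <= after:
        return False
    return True

memories = [
    {"text": "prefers dark mode", "tags": ["ui", "preference"],
     "created_at": datetime(2024, 5, 1)},
    {"text": "deploys on Fridays", "tags": ["process"],
     "created_at": datetime(2024, 1, 10)},
]
hits = [m for m in memories
        if matches(m, tags=["preference"], after=datetime(2024, 3, 1))]
# Only the dark-mode memory passes both filters.
```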
Metadata is stored as JSONB and supports:
- Up to 20 keys per memory
- Up to 3 levels of nesting
- Any JSON-compatible values
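These limits can be checked client-side before storing. The validator below is a sketch of the documented constraints, not code from MemoClaw itself; in particular, "a flat dict counts as one level" is an assumption about how nesting is counted:

```python
def validate_metadata(meta: dict, max_keys: int = 20, max_depth: int = 3) -> None:
    """Raise ValueError if metadata exceeds the documented limits."""
    if len(meta) > max_keys:
        raise ValueError(f"too many keys: {len(meta)} > {max_keys}")

    def depth(value) -> int:
        # Count container levels: a flat dict is depth 1.
        if isinstance(value, dict):
            return 1 + max((depth(v) for v in value.values()), default=0)
        if isinstance(value, list):
            return 1 + max((depth(v) for v in value), default=0)
        return 0

    if depth(meta) > max_depth:
        raise ValueError(f"nesting deeper than {max_depth} levels")
```

Validating before the API call turns a server-side rejection into an immediate, local error.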
When to store
When NOT to store
Best practices
- Be specific — “Ana prefers VSCode with vim bindings” beats “user likes editors”
- Add metadata — Tags enable filtered recall later
- Set importance — 0.9+ for critical info, 0.5 for nice-to-have
- Use namespaces — Isolate memories per project
- Don’t duplicate — Recall before storing similar content
- Respect privacy — Never store secrets
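"Recall before storing" can be sketched as a similarity check against already-stored vectors. The 0.92 threshold is an illustrative choice, not a MemoClaw default, and the vectors are toy values rather than real embeddings:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def is_duplicate(new_vec, existing_vecs, threshold: float = 0.92) -> bool:
    # Treat anything at or above the threshold as already stored:
    # skip the store call rather than create a near-identical memory.
    return any(cosine(new_vec, v) >= threshold for v in existing_vecs)
```

An agent would embed the candidate text, run this check against its recall results, and only store when `is_duplicate` returns False.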