From text to vectors
When you store a memory, MemoClaw converts the text into a 512-dimensional vector using OpenAI’s text-embedding-3-small model. These vectors capture semantic meaning: similar concepts produce similar vectors, even with different wording.
For example, “user prefers dark mode” and “they like dark themes” will have vectors that are very close together, even though the wording differs.
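To make "close together" concrete, here is a minimal sketch of cosine similarity in plain Python. The 4-dimensional vectors are made-up stand-ins for real 512-dimensional embeddings, and the variable names are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (invented values, not real embeddings).
dark_mode = [0.9, 0.1, 0.0, 0.3]    # "user prefers dark mode"
dark_themes = [0.8, 0.2, 0.1, 0.3]  # "they like dark themes"
unrelated = [0.0, 0.9, 0.4, 0.0]    # some unrelated memory

sim_close = cosine_similarity(dark_mode, dark_themes)
sim_far = cosine_similarity(dark_mode, unrelated)
print(f"related: {sim_close:.3f}, unrelated: {sim_far:.3f}")
```

The related pair scores much higher than the unrelated pair, which is the property recall relies on.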
Vector similarity search
When you recall, your query is also converted to a vector. PostgreSQL with pgvector finds the closest stored vectors using cosine distance. The HNSW (Hierarchical Navigable Small World) index makes this fast even with millions of memories.
Scoring formula
Weighted recall scoring:
- The base similarity (1 - cosine_distance) ranges from 0 to 1
- Importance (0 to 1) provides a 0–30% boost
- At equal similarity, a memory with importance 1.0 scores ~43% higher than one with importance 0
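The exact formula is not spelled out here. One form consistent with the figures above is similarity × (0.7 + 0.3 × importance): importance then contributes up to 30% of the weight, and 1.0 / 0.7 ≈ 1.43 matches the ~43% difference. The sketch below assumes that form; the function name and the 0.7/0.3 constants are assumptions, not confirmed MemoClaw internals.

```python
def recall_score(cosine_distance: float, importance: float) -> float:
    """Assumed weighting: similarity * (0.7 + 0.3 * importance)."""
    similarity = 1.0 - cosine_distance   # base similarity
    weight = 0.7 + 0.3 * importance      # 0.7 at importance 0, 1.0 at importance 1
    return similarity * weight

# Two memories at the same distance from the query, different importance.
high = recall_score(cosine_distance=0.2, importance=1.0)
low = recall_score(cosine_distance=0.2, importance=0.0)
ratio = high / low
print(f"importance 1.0 scores {100 * (ratio - 1):.0f}% higher")
```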
Embedding cache
MemoClaw maintains an LRU cache of 1,000 embeddings. Repeated text won’t consume additional OpenAI API tokens. Cache hits are instant. This is particularly useful for agents that recall the same queries across sessions: the embedding is computed once and reused.
Namespaces
Use namespaces to isolate memories per project or context. The default namespace is "default".
Suggested strategy:
| Namespace pattern | Use case |
|---|---|
| default | General user info and preferences |
| project-{name} | Project-specific knowledge |
| session-{date} | Session summaries |
default namespace without any overlap.
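The naming patterns in the table can be generated with small helpers like the ones below. This is only a sketch: the helper names are hypothetical, and both the lowercase hyphenated project slug and the ISO date format for {date} are assumptions, since the patterns do not specify them.

```python
from datetime import date

DEFAULT_NAMESPACE = "default"

def project_namespace(name: str) -> str:
    """Build a project-{name} namespace from a free-form project name."""
    slug = "-".join(name.lower().split())  # assumed slug style
    return f"project-{slug}"

def session_namespace(day: date) -> str:
    """Build a session-{date} namespace; ISO 8601 date is an assumption."""
    return f"session-{day.isoformat()}"

print(project_namespace("My App"))
print(session_namespace(date(2025, 1, 15)))
```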
Tags and metadata filtering
Attach tags to memories for category-based filtering. Recall supports filtering by tags (match any) and by date (after).
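The filter semantics described above (tags match any, date after) can be illustrated with an in-memory sketch. The Memory class and filter_memories function are hypothetical and do not represent MemoClaw's API; treating "after" as a strict comparison is also an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Memory:
    content: str
    created_at: datetime
    tags: list = field(default_factory=list)

def filter_memories(memories, tags=None, after=None):
    """Keep memories sharing at least one tag and created after `after`."""
    results = []
    for m in memories:
        if tags and not set(tags) & set(m.tags):
            continue  # no tag in common: "match any" fails
        if after and m.created_at <= after:
            continue  # not strictly newer than the cutoff
        results.append(m)
    return results

memories = [
    Memory("prefers dark mode", datetime(2025, 1, 10), ["ui", "preferences"]),
    Memory("uses PostgreSQL", datetime(2025, 2, 1), ["stack"]),
    Memory("likes vim bindings", datetime(2025, 3, 5), ["editor", "preferences"]),
]
recent_prefs = filter_memories(
    memories, tags=["preferences", "stack"], after=datetime(2025, 1, 31)
)
```

Here recent_prefs keeps the second and third memories: both share a tag with the filter and were created after the cutoff.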
Metadata is stored as JSONB and supports:
- Up to 20 keys per memory
- Up to 3 levels of nesting
- Any JSON-compatible values
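A client-side check for these limits might look like the sketch below. How the limits are counted is not stated here, so the sketch assumes that the 20-key limit applies to top-level keys, that the top-level object counts as nesting level 1, and that lists do not add a level. The function names are hypothetical.

```python
import json

MAX_KEYS = 20
MAX_DEPTH = 3

def nesting_depth(value) -> int:
    """Count nested object levels; scalars are 0, lists are transparent."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return max((nesting_depth(v) for v in value), default=0)
    return 0

def validate_metadata(metadata: dict) -> None:
    """Raise ValueError or TypeError if metadata breaks the assumed limits."""
    if len(metadata) > MAX_KEYS:
        raise ValueError(f"metadata has {len(metadata)} keys, limit is {MAX_KEYS}")
    if nesting_depth(metadata) > MAX_DEPTH:
        raise ValueError(f"metadata is nested deeper than {MAX_DEPTH} levels")
    # json.dumps raises TypeError for values JSON cannot represent (e.g. sets)
    json.dumps(metadata, allow_nan=False)

validate_metadata({"source": "chat", "editor": {"name": "VSCode", "keys": {"mode": "vim"}}})
```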
When to store
When NOT to store
Best practices
- Be specific — “Ana prefers VSCode with vim bindings” beats “user likes editors”
- Add metadata — Tags enable filtered recall later
- Set importance — 0.9+ for critical info, 0.5 for nice-to-have
- Use namespaces — Isolate memories per project
- Don’t duplicate — Recall before storing similar content
- Respect privacy — Never store secrets