Architecture
turbomem is a small, embeddable agent-memory library. There is no server, no sidecar process, and no native compilation in v0.2, everything runs inside your Node/Bun process.
High-level flow
messages ──► Extractor (LLM) ──► facts ──► EmbeddingAdapter ──► vectors ──► StorageAdapter
│
query ──────────────► EmbeddingAdapter ──► query vector ──► StorageAdapter.search ┘ ──► ranked resultsadd(messages, scope)runs LLM-based fact extraction, embeds each fact in a batch, and inserts the fact + vector + scope into storage.search(query, scope)embeds the query and performs a vector similarity search filtered by scope.
Components
TurboMemory
The single public entry point. It resolves the configured adapters, owns initialisation state, and orchestrates extraction → embedding → storage. All public methods assert initialisation first and throw NotInitialisedError otherwise.
Embedding adapters
Implement EmbeddingAdapter (embed, embedBatch, dimensions):
- OpenAI (
text-embedding-3-small, 1536d default) — batches up to 100 inputs per request, chunking larger batches withPromise.all. - Transformers (local WASM,
Xenova/all-MiniLM-L6-v2, 384d) — lazily loads and caches the model as a module-level singleton, processes batches sequentially because WASM is single-threaded.
Storage adapters
Implement StorageAdapter (init, insert, search, getAll, delete, deleteAll):
- PGlite (default), WASM Postgres +
pgvector. The vector column dimension is set from the embedding adapter'sdimensionsatinit(). Aturbomem_metatable records the dimension so a later dimension change fails loudly withDimensionMismatchError. Similarity search uses the cosine distance operator (<=>) with an HNSW index.
Extractor
Turns a conversation into discrete, third-person facts about the user using an LLM (OpenAI or Anthropic). Extraction is non-fatal: parse/transport failures log a warning and yield [] rather than throwing.
Scoping
Every memory is tagged with optional userId, agentId, and sessionId. Reads and deletes filter by whatever scope fields are provided. This is the extent of multi-tenancy, there is no auth layer.
Error handling
All errors derive from TurboMemError with a stable code (NOT_INITIALISED, EMBEDDING_FAILED, STORAGE_FAILED, EXTRACTION_FAILED, DIMENSION_MISMATCH, INVALID_CONFIG, INVALID_INPUT). Only extraction failures are swallowed; everything else propagates.
See the API reference for error class details.
Adapters as separate entrypoints
Framework integrations (@turbomem/mastra, @turbomem/vercel-ai) are published as separate packages so the core stays framework-agnostic and dependency-light.