View benchmarks

The memory layer

The right memory in every prompt — so your AI answers from context, not tool calls.

See how it works

The differentiator

Same memory, every source

Other memory layers index chat history. TES indexes code, Slack, Gmail, calendar, and docs — and links them. Read a function, surface the Slack thread that decided it.


Six layers, one engine

Six layers, each tuned for a different kind of recall. L2 (HybridRAG) orchestrates them into one ranked result.

L1 · Working memory

System files

The rules your project always honours — read first, every session.

Markdown like MEMORY.md and CLAUDE.md. Human-readable, git-friendly, zero overhead.
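A minimal sketch of what "read first, every session" could look like, assuming working memory is just markdown files on disk. `load_working_memory` and `SYSTEM_FILES` are illustrative names, not TES's actual API:

```python
from pathlib import Path

# File names from the description above; the loader itself is hypothetical.
SYSTEM_FILES = ["MEMORY.md", "CLAUDE.md"]

def load_working_memory(root="."):
    """Read the project's system files at session start.

    Missing files are skipped — zero overhead when a project has none.
    """
    parts = []
    for name in SYSTEM_FILES:
        path = Path(root) / name
        if path.exists():
            parts.append(path.read_text())
    return "\n\n".join(parts)
```

Because the files are plain markdown, they diff cleanly in git and stay readable without any tooling.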

L2 · Reasoning

HybridRAG

The orchestrator. Asks the other layers, then fuses their answers into one ranked list.

Graph + vector + full-text, combined via reciprocal rank fusion and confidence scoring.
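Reciprocal rank fusion, the combining step named above, can be sketched in a few lines. The document ids and the constant `k=60` are illustrative (k=60 is the value from the original RRF paper), not TES internals:

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists (best first) into one by reciprocal rank.

    Each list contributes 1 / (k + rank) per document; documents that
    appear high in multiple lists accumulate the largest scores.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-layer rankings for one query.
graph    = ["decision-42", "thread-7", "commit-a1"]
vector   = ["commit-a1", "decision-42", "doc-9"]
fulltext = ["doc-9", "commit-a1"]

fused = rrf_fuse([graph, vector, fulltext])
# "commit-a1" wins: it appears in all three lists.
```

The appeal of RRF is that it needs no score calibration across layers — only ranks, which every layer can produce.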

L3 · Semantic memory

Knowledge graph

Who decided what, with whom — recall by relationship, not just resemblance.

Hyperedges link people, decisions, and code in a single multi-hop traversal.
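One way to picture the traversal: a hyperedge is a set of nodes, and a multi-hop walk follows any edge containing the current node. The data and names below are invented for illustration; TES's actual graph schema isn't shown here:

```python
from collections import deque

# Hypothetical hyperedges: each links a person, a decision, a thread, or code.
hyperedges = [
    {"alice", "decision:drop-redis", "slack:thread-7"},
    {"decision:drop-redis", "code:session_store.py"},
    {"bob", "slack:thread-7"},
]

def reachable(start, edges, max_hops=2):
    """Breadth-first traversal over hyperedges, bounded by hop count."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for edge in edges:
            if node in edge:
                for other in edge - seen:
                    seen.add(other)
                    frontier.append((other, hops + 1))
    return seen

# From "alice", two hops reach the decision, the thread, the code, and bob.
```

This is recall by relationship: the code file is found because it shares a hyperedge with a decision Alice was part of, not because its text resembles the query.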

L4 · Episodic memory

Vector store

Semantic recall — different words, same meaning.

4,096-dimensional embeddings, chunked and reranked by a cross-encoder.
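The retrieve-then-rerank pattern described above can be sketched as two stages: cheap cosine similarity over embeddings to shortlist candidates, then an expensive cross-encoder that scores query and chunk together. `embed` and `cross_encoder_score` are stand-ins for real models, and the function name is hypothetical:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve_then_rerank(query, chunks, embed, cross_encoder_score, top_k=50):
    # Stage 1: vector recall — fast, approximate, runs over every chunk.
    q = embed(query)
    candidates = sorted(chunks, key=lambda c: cosine(q, embed(c)),
                        reverse=True)[:top_k]
    # Stage 2: cross-encoder — slow but precise, sees query and chunk jointly.
    return sorted(candidates, key=lambda c: cross_encoder_score(query, c),
                  reverse=True)
```

The split matters because a cross-encoder can't be precomputed: it must see the query, so it only runs over the shortlist.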

L5 · Social memory

Comms layer

The conversation behind the code — the thread, email, or meeting that decided it.

Slack, Gmail, and calendar, linked back into the knowledge graph automatically.

L6 · Procedural memory

Document store

Runbooks, RFCs, and PDFs as first-class context — not file dumps.

Full-text search with cross-encoder reranking. Rarely-read docs decay over time.

Cognitive view

Hot memories consolidate.
Cold ones decay.

Six layers say where memory lives. Four cognitive types say how it's used — each with its own capacity and decay rate.

Episodic

Recent events and turns

capacity 10,000
decay 0.05

Semantic

Consolidated knowledge

capacity 5,000
decay 0.001

Procedural

How-to and runbooks

capacity 2,000
decay 0.0001

Working

Temporary scratchpad

capacity 500
decay 0.2
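The capacities and decay rates above can be read as an exponential-decay model: each type ages its items at its own rate. The rates and capacities are the ones listed; the scoring formula itself is an assumption for illustration:

```python
import math

# Capacity and per-day decay rate for each cognitive type, from the table.
TYPES = {
    "episodic":   {"capacity": 10_000, "decay": 0.05},
    "semantic":   {"capacity": 5_000,  "decay": 0.001},
    "procedural": {"capacity": 2_000,  "decay": 0.0001},
    "working":    {"capacity": 500,    "decay": 0.2},
}

def strength(initial, age_days, mem_type):
    """Exponentially decayed strength of one memory item (assumed model)."""
    return initial * math.exp(-TYPES[mem_type]["decay"] * age_days)

# After 30 days a working-memory item has all but vanished, while a
# procedural one is essentially untouched.
```

Under this model, consolidation is just promotion: a hot episodic memory moves to semantic, where the decay rate is fifty times slower.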

Retrieval precision

Atomic facts,
not paragraphs.

Raw turns get distilled into atoms — standalone facts, each pointing back to its source. Search ranks atoms above verbose turns.
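A sketch of the atom-vs-turn distinction as described: an atom is a standalone fact carrying a pointer back to the turn it came from. The fixed ranking boost and all names and contents below are invented for illustration, not TES's actual mechanism:

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    kind: str        # "atom" or "turn"
    source_id: str   # atoms point back to the turn they were distilled from

def rank_key(memory, base_score):
    # Assumed: atoms get a fixed boost so a concise fact outranks a
    # verbose turn of similar relevance.
    return base_score + (0.5 if memory.kind == "atom" else 0.0)

# Hypothetical example data.
turn = Memory("…long discussion touching Redis, sessions, TTLs…",
              "turn", "turn-118")
atom = Memory("Sessions were moved off Redis. (example fact)",
              "atom", "turn-118")
```

Because each atom keeps its `source_id`, the full turn is still one hop away when the distilled fact isn't enough.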

Toggle distillation to see the difference

tes-memory — retrieval
why did we drop redis for sessions?
retrieving
We leaned on Redis for a bunch of things last year — the queue worker rollout in March, a long stretch tuning eviction policies and TTLs, the Memcached vs Redis benchmark for the rate-limiter, and a short-lived Redis Streams pilot for the event bus that got rolled back. Sessions came up in a few of those threads too.

Start using it

One env var or one plugin install

Same memory layer for proxy and plugin. Free tier: 1M proxied input tokens / month.