View benchmarks

The memory layer

The right memory in every prompt — so your AI answers from context, not tool calls.

See how it works

The differentiator

Same memory, every source

Other memory layers index chat history. TES indexes code, Slack, Gmail, calendar, and docs — and links them. Read a function, surface the Slack thread that decided it.


Six layers, one engine

Six layers, each tuned for a different kind of recall. L2 (HybridRAG) orchestrates them into one ranked result.

L1 · Working memory

System files

The rules your project always honours — read first, every session.

Markdown like MEMORY.md and CLAUDE.md. Human-readable, git-friendly, zero overhead.
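A minimal sketch of what "read first, every session" could look like, assuming working memory is just markdown files on disk. `load_working_memory` and `SYSTEM_FILES` are illustrative names, not TES's actual API:

```python
from pathlib import Path

# File names from the description above; the loader itself is hypothetical.
SYSTEM_FILES = ["MEMORY.md", "CLAUDE.md"]

def load_working_memory(root="."):
    """Read the project's system files at session start.

    Missing files are skipped — zero overhead when a project has none.
    """
    parts = []
    for name in SYSTEM_FILES:
        path = Path(root) / name
        if path.exists():
            parts.append(path.read_text())
    return "\n\n".join(parts)
```

Because the files are plain markdown, they diff cleanly in git and stay readable without any tooling.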

L2 · Reasoning

HybridRAG

The orchestrator. Asks the other layers, then fuses their answers into one ranked list.

Graph + vector + full-text, combined via reciprocal rank fusion and confidence scoring.
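Reciprocal rank fusion, the combining step named above, can be sketched in a few lines. The document ids and the constant `k=60` are illustrative (k=60 is the value from the original RRF paper), not TES internals:

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists (best first) into one by reciprocal rank.

    Each list contributes 1 / (k + rank) per document; documents that
    appear high in multiple lists accumulate the largest scores.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-layer rankings for one query.
graph    = ["decision-42", "thread-7", "commit-a1"]
vector   = ["commit-a1", "decision-42", "doc-9"]
fulltext = ["doc-9", "commit-a1"]

fused = rrf_fuse([graph, vector, fulltext])
# "commit-a1" wins: it appears in all three lists.
```

The appeal of RRF is that it needs no score calibration across layers — only ranks, which every layer can produce.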

L3 · Semantic memory

Knowledge graph

Who decided what, with whom — recall by relationship, not just resemblance.

Hyperedges link people, decisions, and code in a single multi-hop traversal.
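One way to picture the traversal: a hyperedge is a set of nodes, and a multi-hop walk follows any edge containing the current node. The data and names below are invented for illustration; TES's actual graph schema isn't shown here:

```python
from collections import deque

# Hypothetical hyperedges: each links a person, a decision, a thread, or code.
hyperedges = [
    {"alice", "decision:drop-redis", "slack:thread-7"},
    {"decision:drop-redis", "code:session_store.py"},
    {"bob", "slack:thread-7"},
]

def reachable(start, edges, max_hops=2):
    """Breadth-first traversal over hyperedges, bounded by hop count."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for edge in edges:
            if node in edge:
                for other in edge - seen:
                    seen.add(other)
                    frontier.append((other, hops + 1))
    return seen

# From "alice", two hops reach the decision, the thread, the code, and bob.
```

This is recall by relationship: the code file is found because it shares a hyperedge with a decision Alice was part of, not because its text resembles the query.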

L4 · Episodic memory

Vector store

Semantic recall — different words, same meaning.

4,096-dimensional embeddings, chunked and reranked by a cross-encoder.
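The retrieve-then-rerank pattern described above can be sketched as two stages: cheap cosine similarity over embeddings to shortlist candidates, then an expensive cross-encoder that scores query and chunk together. `embed` and `cross_encoder_score` are stand-ins for real models, and the function name is hypothetical:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve_then_rerank(query, chunks, embed, cross_encoder_score, top_k=50):
    # Stage 1: vector recall — fast, approximate, runs over every chunk.
    q = embed(query)
    candidates = sorted(chunks, key=lambda c: cosine(q, embed(c)),
                        reverse=True)[:top_k]
    # Stage 2: cross-encoder — slow but precise, sees query and chunk jointly.
    return sorted(candidates, key=lambda c: cross_encoder_score(query, c),
                  reverse=True)
```

The split matters because a cross-encoder can't be precomputed: it must see the query, so it only runs over the shortlist.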

L5 · Social memory

Comms layer

The conversation behind the code — the thread, email, or meeting that decided it.

Slack, Gmail, and calendar, linked back into the knowledge graph automatically.

L6 · Procedural memory

Document store

Runbooks, RFCs, and PDFs as first-class context — not file dumps.

Full-text search with cross-encoder reranking. Rarely-read docs decay over time.

Cognitive view

Hot memories consolidate.
Cold ones decay.

Six layers say where memory lives. Four cognitive types say how it's used — each with its own capacity and decay rate.

Episodic

Recent events and turns

capacity 10,000
decay 0.05

Semantic

Consolidated knowledge

capacity 5,000
decay 0.001

Procedural

How-to and runbooks

capacity 2,000
decay 0.0001

Working

Temporary scratchpad

capacity 500
decay 0.2
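The capacities and decay rates above can be read as an exponential-decay model: each type ages its items at its own rate. The rates and capacities are the ones listed; the scoring formula itself is an assumption for illustration:

```python
import math

# Capacity and per-day decay rate for each cognitive type, from the table.
TYPES = {
    "episodic":   {"capacity": 10_000, "decay": 0.05},
    "semantic":   {"capacity": 5_000,  "decay": 0.001},
    "procedural": {"capacity": 2_000,  "decay": 0.0001},
    "working":    {"capacity": 500,    "decay": 0.2},
}

def strength(initial, age_days, mem_type):
    """Exponentially decayed strength of one memory item (assumed model)."""
    return initial * math.exp(-TYPES[mem_type]["decay"] * age_days)

# After 30 days a working-memory item has all but vanished, while a
# procedural one is essentially untouched.
```

Under this model, consolidation is just promotion: a hot episodic memory moves to semantic, where the decay rate is fifty times slower.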

Retrieval precision

Atomic facts,
not paragraphs.

Raw turns get distilled into atoms — standalone facts, each pointing back to its source. Search ranks atoms above verbose turns.
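A sketch of the atom-vs-turn distinction as described: an atom is a standalone fact carrying a pointer back to the turn it came from. The fixed ranking boost and all names and contents below are invented for illustration, not TES's actual mechanism:

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    kind: str        # "atom" or "turn"
    source_id: str   # atoms point back to the turn they were distilled from

def rank_key(memory, base_score):
    # Assumed: atoms get a fixed boost so a concise fact outranks a
    # verbose turn of similar relevance.
    return base_score + (0.5 if memory.kind == "atom" else 0.0)

# Hypothetical example data.
turn = Memory("…long discussion touching Redis, sessions, TTLs…",
              "turn", "turn-118")
atom = Memory("Sessions were moved off Redis. (example fact)",
              "atom", "turn-118")
```

Because each atom keeps its `source_id`, the full turn is still one hop away when the distilled fact isn't enough.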

Toggle distillation to see the difference

tes-memory — retrieval
why did we drop redis for sessions?
retrieving
We leaned on Redis for a bunch of things last year — the queue worker rollout in March, a long stretch tuning eviction policies and TTLs, the Memcached vs Redis benchmark for the rate-limiter, and a short-lived Redis Streams pilot for the event bus that got rolled back. Sessions came up in a few of those threads too.

Start using it

One env var or one plugin install

Same memory layer for proxy and plugin. Free tier: 1M proxied input tokens / month.