Give Claude Code persistent memory.
Every Claude Code session, prompt, and tool call becomes a searchable memory. Install via the Claude Code marketplace, point it at the hosted TES platform or a local Docker stack, and your future sessions can recall what your past sessions decided.
Install
Three slash commands. Done.
1. Add the marketplace
/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
Claude Code clones the repo and registers the pentatonic-ai marketplace.
2. Install the plugin
/plugin install tes-memory@pentatonic-ai
Installs the tes-memory MCP server and the five lifecycle hooks.
3. Connect a memory backend
/tes-memory:tes-setup
The slash command walks you through npx @pentatonic-ai/ai-agent-sdk init, captures the credentials it prints, and writes them to your config file. For a fully local stack, run npx @pentatonic-ai/ai-agent-sdk memory instead; see the local-mode config below.
Configuration
Hosted or local — same plugin
Config lives at ~/.claude/tes-memory.local.md or ~/.claude-pentatonic/tes-memory.local.md. The MCP server checks both paths and respects CLAUDE_CONFIG_DIR if set.
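As a rough illustration of the lookup described above, the sketch below builds the list of candidate config paths and returns the first one that exists. The function names, the injected exists predicate, and the precedence order (CLAUDE_CONFIG_DIR first, then ~/.claude, then ~/.claude-pentatonic) are assumptions for illustration, not the plugin's documented behavior.

```javascript
// Sketch only: the precedence order here is an assumption, not the plugin's documented logic.
function candidateConfigPaths(env, homeDir) {
  const fileName = "tes-memory.local.md";
  const candidates = [];
  // Assumption: an explicitly set CLAUDE_CONFIG_DIR is tried before the defaults.
  if (env.CLAUDE_CONFIG_DIR) {
    candidates.push(`${env.CLAUDE_CONFIG_DIR}/${fileName}`);
  }
  candidates.push(`${homeDir}/.claude/${fileName}`);
  candidates.push(`${homeDir}/.claude-pentatonic/${fileName}`);
  // Remove duplicates while keeping the original order.
  return [...new Set(candidates)];
}

// `exists` is an injected predicate (for example fs.existsSync in Node).
function resolveConfigPath(env, homeDir, exists) {
  const found = candidateConfigPaths(env, homeDir).find((p) => exists(p));
  return found ?? null;
}

// Usage with a fake file system containing only the second default location.
const onDisk = new Set(["/home/dev/.claude-pentatonic/tes-memory.local.md"]);
console.log(resolveConfigPath({}, "/home/dev", (p) => onDisk.has(p)));
```

Passing the environment and the existence check as arguments keeps the sketch free of file-system access, so it can be exercised without touching a real home directory.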
Hosted TES
---
tes_endpoint: https://your-client.api.pentatonic.com
tes_client_id: your-client
tes_api_key: tes_your-client_xxxxx
tes_user_id: you@company.com
---
Get these values from /signup or by running npx @pentatonic-ai/ai-agent-sdk init.
Local memory
---
mode: local
memory_url: http://localhost:3333
---
The local CLI (npx @pentatonic-ai/ai-agent-sdk memory) writes this file for you. It also stands up Docker containers for PostgreSQL + pgvector + Ollama and pulls the embedding and chat models.
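To make the two config shapes concrete, here is a minimal sketch of reading the front matter block and deciding which backend it describes. The parser handles only flat key: value pairs, and both function names and the rule "mode: local means local, anything else means hosted" are assumptions for illustration; the plugin may parse its config differently.

```javascript
// Minimal reader for flat "key: value" pairs between two --- lines.
// Sketch only; not the plugin's actual parser.
function parseFrontMatter(text) {
  const lines = text.split(/\r?\n/);
  const start = lines.findIndex((l) => l.trim() === "---");
  if (start === -1) return {};
  const end = lines.findIndex((l, i) => i > start && l.trim() === "---");
  if (end === -1) return {};
  const config = {};
  for (const line of lines.slice(start + 1, end)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    // Split on the first colon only, so URL values keep their "://".
    config[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return config;
}

// Assumption: `mode: local` selects the local backend; anything else is treated as hosted.
function describeBackend(config) {
  return config.mode === "local"
    ? { kind: "local", url: config.memory_url }
    : { kind: "hosted", url: config.tes_endpoint };
}

const localConfig = parseFrontMatter("---\nmode: local\nmemory_url: http://localhost:3333\n---\n");
console.log(describeBackend(localConfig)); // { kind: 'local', url: 'http://localhost:3333' }
```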
Hooks
Five lifecycle events
The plugin registers handlers for the five Claude Code hook events. Together they form the per-session timeline that lands in TES.
SessionStart
Fires when Claude Code opens a session. Captures session metadata for later linkage.
UserPromptSubmit
Fires before each user message is sent to the model. The plugin can inject relevant memories as context here.
PostToolUse
Fires after each tool call. Captures the tool name, input, and result so the session timeline is complete.
Stop
Fires when the assistant finishes a response. Emits a CHAT_TURN with token totals.
SessionEnd
Fires when the session closes. Finalises the session record.
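One way to picture the per-session timeline is as an ordered list of events, one per hook firing. The sketch below is a hypothetical in-memory model: the five event names come from the list above, while createTimeline, the payload fields, and the tokens property are assumptions and not the plugin's actual data format.

```javascript
// Hypothetical in-memory timeline. Event names match the five hooks;
// payload fields are illustrative assumptions.
const HOOK_EVENTS = ["SessionStart", "UserPromptSubmit", "PostToolUse", "Stop", "SessionEnd"];

function createTimeline(sessionId) {
  const events = [];
  return {
    // Append one event, rejecting anything that is not one of the five hooks.
    record(hook, payload = {}) {
      if (!HOOK_EVENTS.includes(hook)) throw new Error(`Unknown hook: ${hook}`);
      events.push({ sessionId, seq: events.length, hook, payload });
    },
    // Sum token counts over Stop events, loosely mirroring CHAT_TURN token totals.
    totalTokens() {
      return events
        .filter((e) => e.hook === "Stop")
        .reduce((sum, e) => sum + (e.payload.tokens ?? 0), 0);
    },
    // Return a copy so callers cannot mutate the timeline directly.
    events: () => events.slice(),
  };
}

const timeline = createTimeline("session-1");
timeline.record("SessionStart", { cwd: "/repo" });
timeline.record("UserPromptSubmit", { prompt: "how do we handle auth?" });
timeline.record("PostToolUse", { tool: "Read", input: "auth.js", result: "ok" });
timeline.record("Stop", { tokens: 1200 });
timeline.record("SessionEnd");
console.log(timeline.events().map((e) => e.hook).join(" -> "));
```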
How memory lands in the model
Written into Claude Code's native MEMORY.md — not an injected note
Most "memory" plugins inject context as additionalContext on each prompt. The model reliably treats that as supplementary notes — useful at best, dismissed at worst. We hit the same wall, then shipped a fix that uses Claude Code's own trust path.
Most plugins
Inject memories as additional context on each turn. The model often hedges or ignores them — even with explicit "AUTHORITATIVE SOURCE" framing.
User: what car do I drive?
Model: I don't have access to that information.
tes-memory plugin
Writes retrieved memories into Claude Code's MEMORY.md at ~/.claude-pentatonic/projects/<slug>/memory/. Claude Code auto-loads it and treats it as the model's own trusted memory.
User: what car do I drive?
Model: A Subaru and a Hyundai.
Shipped in @pentatonic-ai/ai-agent-sdk commit 92c6cca, verifiable in the repo. Pentatonic memory entries are written into a file Claude Code already trusts; we don't fight the model for attention, we use the same channel it uses for its own persistent context.
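The idea of writing memories into a file rather than injecting them per prompt can be sketched as two small steps: compute the file location, then render the retrieved memories as text. The directory layout follows the path given above; the function names, the "# Memory" heading, the bullet format, and the de-duplication rule are assumptions for illustration and do not describe the file the plugin actually writes.

```javascript
// Sketch only. The directory layout follows the path in the text above;
// the Markdown layout below is an assumed format, not the plugin's.
function memoryFilePath(baseDir, slug) {
  return `${baseDir}/projects/${slug}/memory/MEMORY.md`;
}

function renderMemoryFile(memories) {
  const seen = new Set();
  const lines = ["# Memory", ""];
  for (const memory of memories) {
    // Collapse whitespace so each memory fits on a single bullet line.
    const text = memory.content.replace(/\s+/g, " ").trim();
    // Skip empty entries and exact repeats.
    if (text === "" || seen.has(text)) continue;
    seen.add(text);
    lines.push(`- ${text}`);
  }
  return lines.join("\n") + "\n";
}

const retrieved = [
  { content: "Drives a Subaru" },
  { content: "Owns a Hyundai" },
];
console.log(memoryFilePath("/home/dev/.claude-pentatonic", "my-project"));
console.log(renderMemoryFile(retrieved));
```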
MCP tools
Three tools, one memory store
The tes-memory MCP server exposes these tools to Claude. The auto-search hook on UserPromptSubmit calls search_memories for you, but Claude can invoke any of these directly.
search_memories
Semantic search over your team's shared knowledge. Claude can call this directly when it needs context from past conversations, decisions, or debugging sessions.
Args: query (required), userId? (filter), limit? (default 10)
// Claude can invoke directly when it needs prior context:
search_memories({
  query: "how we handle auth in the API",
  limit: 5,
})
store_memory
Explicitly persist a memory into the episodic layer. Use for decisions, architectural choices, or debugging solutions you want findable in future sessions. Hosted TES embeds with NV-Embed-v2 (4096-dim); local mode uses whatever embedder you configured (default: nomic-embed-text, 768-dim).
Args: content (required), metadata? (object — e.g. { topic, type })
store_memory({
  content: "JWT 1h expiry. Refresh tokens in httpOnly cookies.",
  metadata: { topic: "auth", type: "decision" },
})
list_memory_layers
Inspect the four default layers on your tenant and how full each is. The layers are: episodic (recent events, fast decay, capacity 10k), semantic (consolidated knowledge, slow decay, capacity 5k), procedural (how-to, almost no decay, capacity 2k), working (temporary, fast decay, capacity 500). Defined in the SDK at packages/memory/src/layers.js.
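The four layers can be pictured as a small table of names, decay behavior, and capacities. The sketch below copies those values from the description above; the object shape, the decay labels as strings, and the formatLayerUsage helper are assumptions for illustration and are not the contents of packages/memory/src/layers.js.

```javascript
// Values copied from the layer description; the data shape is an assumption.
const DEFAULT_LAYERS = [
  { name: "episodic", decay: "fast", capacity: 10000 },
  { name: "semantic", decay: "slow", capacity: 5000 },
  { name: "procedural", decay: "almost none", capacity: 2000 },
  { name: "working", decay: "fast", capacity: 500 },
];

// Render "name: used/capacity" for each layer, treating missing counts as 0.
function formatLayerUsage(counts) {
  return DEFAULT_LAYERS.map(
    (layer) => `${layer.name}: ${counts[layer.name] ?? 0}/${layer.capacity}`
  );
}

console.log(
  formatLayerUsage({ episodic: 142, semantic: 38, procedural: 11, working: 3 }).join("\n")
);
```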
Args: no arguments
list_memory_layers()
// → episodic: 142/10000
//   semantic: 38/5000
//   procedural: 11/2000
//   working: 3/500
Verify
Try it
# Inside a Claude Code session, ask:
"Search my memories for anything about authentication"
# Or store something explicitly:
"Remember that we use 1-hour JWT expiry with httpOnly refresh cookies"
Future sessions (yours and your teammates') will surface those memories whenever they're relevant. Full installation walkthrough on GitHub.
Claude Code plugin
Never lose context between sessions again
Free tier covers 1M proxied input tokens per month — enough for the average solo dev's Claude Code usage. Hosted memory + dashboard included.