Claude Code plugin · MIT · Part of the AI Agent SDK

Give Claude Code persistent memory.

Every Claude Code session, prompt, and tool call becomes a searchable memory. Install via the Claude Code marketplace, point it at the hosted TES platform or a local Docker stack, and your future sessions can recall what your past sessions decided.

Install

Three slash commands. Done.

1. Add the marketplace

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk

Claude Code clones the repo and registers the pentatonic-ai marketplace.

2. Install the plugin

/plugin install tes-memory@pentatonic-ai

Installs the tes-memory MCP server and the five lifecycle hooks.

3. Connect a memory backend

/tes-memory:tes-setup

The slash command walks you through npx @pentatonic-ai/ai-agent-sdk init, captures the credentials it prints, and writes them to your config file. For a fully-local stack, run npx @pentatonic-ai/ai-agent-sdk memory instead — see the local-mode config below.

Configuration

Hosted or local — same plugin

Config lives at ~/.claude/tes-memory.local.md or ~/.claude-pentatonic/tes-memory.local.md. The MCP server checks both paths and respects CLAUDE_CONFIG_DIR if set.
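
As a rough illustration of the lookup described above, the sketch below builds the list of candidate config paths. The order (CLAUDE_CONFIG_DIR first, then the two documented directories) is an assumption, and `candidateConfigPaths` is a hypothetical helper, not a function exported by the plugin.

```javascript
// Sketch only: the precedence shown here is assumed, not confirmed by the plugin source.
const CONFIG_FILE = "tes-memory.local.md";

// Returns candidate config paths, most specific first.
// `env` and `home` are passed in so the function stays pure and easy to test.
function candidateConfigPaths(env, home) {
  const dirs = [env.CLAUDE_CONFIG_DIR, `${home}/.claude`, `${home}/.claude-pentatonic`]
    .filter(Boolean)                         // skip CLAUDE_CONFIG_DIR when unset
    .map((dir) => dir.replace(/\/+$/, "")); // strip trailing slashes
  // Drop duplicates while preserving order, then append the file name.
  return [...new Set(dirs)].map((dir) => `${dir}/${CONFIG_FILE}`);
}

console.log(candidateConfigPaths({ CLAUDE_CONFIG_DIR: "/tmp/claude/" }, "/home/me"));
```

A caller would walk this list and use the first path that exists on disk.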

Hosted TES

tes-memory.local.md
---
tes_endpoint: https://your-client.api.pentatonic.com
tes_client_id: your-client
tes_api_key: tes_your-client_xxxxx
tes_user_id: you@company.com
---

Get these values from /signup or by running npx @pentatonic-ai/ai-agent-sdk init.

Local memory

tes-memory.local.md
---
mode: local
memory_url: http://localhost:3333
---

The local CLI (npx @pentatonic-ai/ai-agent-sdk memory) writes this file for you. It also stands up Docker containers for PostgreSQL + pgvector + Ollama and pulls the embedding + chat models.
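
To make the two config shapes concrete, here is a minimal sketch of reading the frontmatter and deciding which backend it describes. `parseFrontmatter` and `resolveBackend` are hypothetical helpers, and the set of keys treated as required for hosted mode is an assumption; the plugin's real loader may differ.

```javascript
// Parse the flat `key: value` lines between the two `---` fences.
// Deliberately minimal: no nested YAML, no quoting rules.
function parseFrontmatter(text) {
  const match = text.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!match) return {};
  const config = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(":"); // split on the first colon so URLs survive
    if (idx === -1) continue;
    config[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return config;
}

// Hypothetical helper: decide which backend a parsed config points at.
function resolveBackend(config) {
  if (config.mode === "local") {
    return { kind: "local", url: config.memory_url };
  }
  // Assumption: these three keys are the minimum needed for hosted TES.
  const missing = ["tes_endpoint", "tes_client_id", "tes_api_key"].filter((key) => !config[key]);
  if (missing.length > 0) {
    throw new Error(`Missing hosted config keys: ${missing.join(", ")}`);
  }
  return { kind: "hosted", url: config.tes_endpoint };
}

const local = parseFrontmatter("---\nmode: local\nmemory_url: http://localhost:3333\n---\n");
console.log(resolveBackend(local));
```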

Hooks

Five lifecycle events

The plugin registers handlers for five of Claude Code's hook events. Together they form the per-session timeline that lands in TES.

SessionStart

Fires when Claude Code opens a session. Captures session metadata for later linkage.

UserPromptSubmit

Fires before each user message is sent to the model. The plugin can inject relevant memories as context here.

PostToolUse

Fires after each tool call. Captures the tool name, input, and result so the session timeline is complete.

Stop

Fires when the assistant finishes a response. Emits a CHAT_TURN with token totals.

SessionEnd

Fires when the session closes. Finalises the session record.
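
One way to picture the timeline is as a mapping from each hook payload to a timeline entry. The sketch below is illustrative only: the payload field names (`hook_event_name`, `session_id`, `prompt`, `tool_name`, `tool_input`, `tool_response`) follow Claude Code's hook input as commonly documented and should be checked against the hooks reference, while the entry shape and every entry type other than CHAT_TURN are hypothetical.

```javascript
// Hypothetical entry types; only CHAT_TURN is named in the text above.
const ENTRY_TYPES = new Map([
  ["SessionStart", "SESSION_START"],
  ["UserPromptSubmit", "USER_PROMPT"],
  ["PostToolUse", "TOOL_USE"],
  ["Stop", "CHAT_TURN"],
  ["SessionEnd", "SESSION_END"],
]);

// Convert one hook payload into a timeline entry, or null for unhandled events.
function toTimelineEntry(payload) {
  const type = ENTRY_TYPES.get(payload.hook_event_name);
  if (!type) return null; // not one of the five events listed above
  const entry = { type, sessionId: payload.session_id };
  if (type === "USER_PROMPT") {
    entry.prompt = payload.prompt;
  }
  if (type === "TOOL_USE") {
    // Tool name, input, and result, as described for PostToolUse.
    entry.tool = payload.tool_name;
    entry.input = payload.tool_input;
    entry.result = payload.tool_response;
  }
  return entry;
}

console.log(toTimelineEntry({ hook_event_name: "Stop", session_id: "abc123" }));
```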

How memory lands in the model

Written into Claude Code's native MEMORY.md — not an injected note

Most "memory" plugins inject context as additionalContext on each prompt. The model tends to treat that as supplementary notes: useful at best, dismissed at worst. We hit the same wall, then shipped a fix that uses Claude Code's own trust path.

Most plugins

Inject memories as additional context on each turn. The model often hedges or ignores them — even with explicit "AUTHORITATIVE SOURCE" framing.

User: what car do I drive?
Model: I don't have access to that information.

tes-memory plugin

Writes retrieved memories into Claude Code's MEMORY.md at ~/.claude-pentatonic/projects/<slug>/memory/. Claude Code auto-loads it and treats it as the model's own trusted memory.

User: what car do I drive?
Model: A Subaru and a Hyundai.

Shipped in @pentatonic-ai/ai-agent-sdk commit 92c6cca — verifiable in the repo. Pentatonic memory entries are written into a file Claude Code already trusts; we don't fight the model for attention, we use the same channel it uses for its own persistent context.
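
A minimal sketch of the idea, assuming retrieved memories arrive as objects with a `content` string: build the target path and render the memories as plain Markdown bullets. The file layout, the heading, and both helper names are assumptions made for illustration; how the plugin derives `<slug>` and formats entries is not specified here, so the slug is simply passed in.

```javascript
// Build the MEMORY.md path for a project. The slug is supplied by the caller
// because its derivation is not documented above.
function memoryFilePath(home, slug) {
  return `${home}/.claude-pentatonic/projects/${slug}/memory/MEMORY.md`;
}

// Render memories as one bullet per entry (assumed format, not the plugin's).
function renderMemoryFile(memories) {
  const lines = ["# Memory", ""];
  for (const memory of memories) {
    // Collapse whitespace so each memory stays on a single bullet line.
    lines.push(`- ${memory.content.replace(/\s+/g, " ").trim()}`);
  }
  return lines.join("\n") + "\n";
}

console.log(memoryFilePath("/home/me", "my-project"));
console.log(renderMemoryFile([{ content: "Drives a Subaru and a Hyundai." }]));
```

Writing the rendered string to that path (for example with Node's `fs.writeFileSync`) is left out so the sketch has no side effects.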

MCP tools

Three tools, one memory store

The tes-memory MCP server exposes these tools to Claude. The auto-search hook on UserPromptSubmit calls search_memories for you, but Claude can invoke any of these directly.

search_memories

Semantic search over your team's shared knowledge. Claude can call this directly when it needs context from past conversations, decisions, or debugging sessions.

Args: query (required), userId? (filter), limit? (default 10)

// Claude can invoke directly when it needs prior context:
search_memories({
  query: "how we handle auth in the API",
  limit: 5,
})
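
The argument rules above (required query, optional userId filter, limit defaulting to 10) can be sketched as a small normaliser. `normalizeSearchArgs` is a hypothetical helper written for illustration; the MCP server's own validation may behave differently.

```javascript
// Apply the documented defaults to a search_memories argument object.
function normalizeSearchArgs(args = {}) {
  if (typeof args.query !== "string" || args.query.trim() === "") {
    throw new TypeError("search_memories: `query` is required");
  }
  const normalized = {
    query: args.query.trim(),
    limit: args.limit ?? 10, // documented default
  };
  // Only include the filter when the caller supplied one.
  if (args.userId !== undefined) normalized.userId = args.userId;
  return normalized;
}

console.log(normalizeSearchArgs({ query: "how we handle auth in the API" }));
```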

store_memory

Explicitly persist a memory into the episodic layer. Use for decisions, architectural choices, or debugging solutions you want findable in future sessions. Hosted TES embeds with NV-Embed-v2 (4096-dim); local mode uses whatever embedder you configured (default: nomic-embed-text, 768-dim).

Args: content (required), metadata? (object — e.g. { topic, type })

store_memory({
  content: "JWT 1h expiry. Refresh tokens in httpOnly cookies.",
  metadata: { topic: "auth", type: "decision" },
})

list_memory_layers

Inspect the four default layers on your tenant and how full each is. Layers are: episodic (recent events, fast decay, capacity 10k), semantic (consolidated knowledge, slow decay, capacity 5k), procedural (how-to, almost no decay, capacity 2k), working (temporary, fast decay, capacity 500). Defined in the SDK at packages/memory/src/layers.js.

Args: no arguments

list_memory_layers()
// → episodic: 142/10000
//   semantic: 38/5000
//   procedural: 11/2000
//   working: 3/500
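
To show how the counts relate to the capacities listed above, the sketch below computes per-layer utilisation. The capacities come from the layer description; the `layerUsage` helper and the shape of its input are hypothetical and do not mirror the actual return format of list_memory_layers.

```javascript
// Capacities as stated in the layer description above.
const LAYER_CAPACITY = { episodic: 10000, semantic: 5000, procedural: 2000, working: 500 };

// Given raw counts per layer, return usage rows with a percentage to one decimal place.
function layerUsage(counts) {
  return Object.entries(LAYER_CAPACITY).map(([layer, capacity]) => {
    const used = counts[layer] ?? 0;
    // Multiply before dividing so the rounding works on exact values.
    const percent = Math.round((used * 1000) / capacity) / 10;
    return { layer, used, capacity, percent };
  });
}

const sample = { episodic: 142, semantic: 38, procedural: 11, working: 3 };
for (const { layer, used, capacity, percent } of layerUsage(sample)) {
  console.log(`${layer}: ${used}/${capacity} (${percent}%)`);
}
```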

Verify

Try it

# Inside a Claude Code session, ask:
"Search my memories for anything about authentication"

# Or store something explicitly:
"Remember that we use 1-hour JWT expiry with httpOnly refresh cookies"

Future sessions — yours and your teammates' — will surface those memories whenever they're relevant. Full installation walkthrough on GitHub.

Never lose context between sessions again

Free tier covers 1M proxied input tokens per month — enough for the average solo dev's Claude Code usage. Hosted memory + dashboard included.