# Thing Event System (TES)

> A drop-in token-compression layer for any LLM. TES proxies requests to Anthropic, OpenAI, MiniMax, or any OpenAI-compatible endpoint — retrieving the context the model would have re-derived this turn and injecting it as a preamble. 95% fewer input tokens on code and agentic workloads; see /benchmarks for the full reproducible methodology.

TES is built by Pentatonic. It sits between your application and your LLM provider. Same SDK, same model, same response shape — the only difference is one environment variable.

## The mechanism

Every turn, your LLM re-reads files, re-runs searches, and re-derives context it already had. TES intercepts the request, fetches that context from cache, persistent memory, or your retrieval source, injects it as a preamble, and forwards the call. The model answers from what's in front of it instead of tool-calling for it.

## Quickstart — change one env var

The proxy at `llm.api.pentatonic.com` exposes two routes today: `POST /v1/messages` (Anthropic Messages, forwards to api.anthropic.com) and `POST /v1/chat/completions` (OpenAI Chat Completions, forwards to api.openai.com). Pass `X-TES-Token: tes__` as a default header on your client.

```bash
# Anthropic SDK (TypeScript / Python)
# The SDK appends /v1/messages to the base URL.
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_BASE_URL=https://llm.api.pentatonic.com
TES_API_KEY=tes__

# OpenAI SDK or any OpenAI-compatible client
# The SDK appends /chat/completions to the base URL — note the /v1.
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://llm.api.pentatonic.com/v1
TES_API_KEY=tes__
```

Other OpenAI-compatible upstreams (MiniMax, vLLM, custom inference) are routed via dedicated upstream config on the Enterprise tier — same wire shape, different forwarding URL.

Code stays the same. Response shape stays the same. Token count drops.

## Pricing

- **Free**: $0, 1M proxied input tokens / month. No credit card.
- **Pro**: $0.50 per 1M input tokens proxied. $20/mo minimum.
- **Enterprise**: $0.30 per 1M at volume. Annual commit. $1k/mo minimum.

Per-token pricing — same shape as Anthropic or OpenAI direct, at roughly half the rate. Customers win twice: a lower per-token rate, plus fewer tokens per turn from the preamble compression. Both wins are visible by comparing two invoices.

## Compatibility

- Anthropic (Sonnet, Haiku, Opus)
- OpenAI (GPT-4o, GPT-5, o-series)
- MiniMax
- Any OpenAI-compatible endpoint (vLLM, llama.cpp, custom inference)

## What we don't do

- We don't swap your model.
- We don't read your responses.
- We don't change response shape.
- We don't make tool calls on your behalf unless you explicitly enable retrieval policies.

Send `X-TES-Mode: passthrough` on any request and TES forwards it to the upstream untouched.

---

## Legacy: TES GraphQL API (internal — powers the memory layer)

TES also exposes a GraphQL API that powers the memory layer behind the proxy. This is primarily an internal interface — customers don't integrate with it directly; they use the proxy (above). The GraphQL surface models the domain as Things with lifecycle, Holders, Locations, Products, Shipments, and Payments, and is retained here for existing integrations and documentation completeness.

## API

- Endpoint: `POST /api/graphql`
- Auth: OAuth 2.0 Bearer Token (`Authorization: Bearer <token>`)
- Required header: `X-Client-Id: <client-id>`
- Format: GraphQL over HTTPS (JSON request/response)

## Core Entities

### Things

Physical items tracked through 26 lifecycle stages. Each thing has a holder (who has it), a location (where it is), and optionally a product (what it is). AI enrichment automatically identifies brand, model, condition, and market value from uploaded images.

### Holders

People or organisations that hold things: customers, warehouses, stores, carriers, processors, manufacturers. Custody transfers are recorded in the underlying store.

### Locations

Physical places: warehouses, stores, processing centres, distribution centres.
Things move between locations, and every move is tracked.

### Products

The product catalog. Things are instances of products. Products have brand, category, SKU, tags, and features.

### Shipments

Logistics tracking with multi-provider support, tracking numbers, status history, cost, and delivery confirmation.

### Payments

Financial transactions linked to things: prepay, COD, refund. Status lifecycle from pending through processing to completed.

### Events (internal)

Internal audit records. Every mutation on the GraphQL side produces an event with entityId, entityType, eventType, timestamp, and payload. Customers using the proxy do not interact with this directly.

## Key Queries

- `things(limit, offset)` — list things with pagination
- `thing(id)` — get a thing with all enriched fields (vision, pricing, valuation)
- `holders(limit, offset)` — list holders
- `holder(id)` — get holder details
- `locations(limit, offset)` — list locations
- `location(id)` — get location details
- `products(limit, offset)` — list products
- `product(id)` — get product details
- `shipments(filter, limit, offset)` — list shipments
- `payments(filter, limit, offset)` — list payments
- `events(filter, limit, offset)` — list events
- `searchThings(input)` — vector similarity search across things
- `searchProducts(input)` — vector similarity search across products
- `thingsCountByStage(stages)` — analytics by lifecycle stage
- `eventStats` — event counts by type and time period
- `edgeHistory(entityId)` — relationship change history

## Key Mutations

- `createThing(input)` — create a thing (physical or digital)
- `updateThing(id, input)` — update thing attributes
- `addThingStatus(id, input)` — advance lifecycle stage
- `transferThing(id, input)` — transfer custody to new holder
- `changeThingLocation(id, input)` — move to new location
- `uploadThingImage(id, input)` — upload image (triggers AI enrichment)
- `createHolder(input)` — create a holder
- `createLocation(input)` — create a location
- `createProduct(input)` — create a product
- `createShipment(input)` — create a shipment
- `createPayment(input)` — create a payment

## Lifecycle Stages

Things progress through these stages:

manufactured → sourced → in_stock → in_transit → delivered → sold → in_use → captured → identified → valued → returned → received → processing → inspecting → refurbishing → repairing → processed → certified → listed → resold → recycled → donated → disposed

Problem states: issue, rejected, lost

## AI Enrichment Pipeline

When an image is uploaded to a thing, TES automatically runs:

1. Vision Analysis — identifies brand, model, colorway, category, condition
2. Market Pricing — returns a price range (low/mid/high) with confidence
3. Valuation — estimates resale value
4. Name Generation — auto-generates a name from vision data
5. Text Embedding — 1024-dim vector for semantic search
6. Product Matching — auto-links to a catalog product or creates a new one

## Vector Search

Search things and products by natural-language query, similar item ID, or image. Uses BGE-M3 embeddings (1024 dimensions) with cosine similarity ranking.

```graphql
query {
  searchThings(input: { query: "vintage leather jacket", min_score: 0.7, limit: 10 }) {
    items {
      score
      thing {
        id
        name
        vision {
          brand
          category
          condition { grade }
        }
      }
    }
  }
}
```

## SDKs

### JavaScript / TypeScript

```bash
npm install @pentatonic-ai/ai-agent-sdk
```

```javascript
import OpenAI from 'openai';
import { TESClient } from '@pentatonic-ai/ai-agent-sdk';

const tes = new TESClient({
  apiKey: 'tes_...',
  clientId: 'your-client-id',
});

// Auto-wrap any LLM client for observability
const ai = tes.wrap(new OpenAI());
```

### Python

```bash
pip install pentatonic-ai-agent-sdk
```

```python
from pentatonic_ai_agent_sdk import TESClient

tes = TESClient(
    api_key="tes_...",
    client_id="your-client-id",
)

# Auto-wrap any LLM client, e.g. an existing OpenAI() instance
ai = tes.wrap(openai_client)
```

Both SDKs provide LLM observability (token usage, tool calls, conversation tracking) and direct TES API access.
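The vector search described earlier ranks results by cosine similarity over embeddings. As a minimal, dependency-free sketch of that ranking — with toy 3-dimensional vectors standing in for the 1024-dimensional BGE-M3 embeddings, and illustrative item names — the core computation looks like this:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for 1024-dim BGE-M3 embeddings.
query_vec = [0.9, 0.1, 0.0]
items = {
    "vintage leather jacket": [0.8, 0.2, 0.1],
    "stainless steel kettle": [0.1, 0.9, 0.3],
}

# Rank items by similarity to the query, highest score first —
# conceptually what searchThings does before applying min_score.
ranked = sorted(
    ((cosine_similarity(query_vec, vec), name) for name, vec in items.items()),
    reverse=True,
)
for score, name in ranked:
    print(f"{score:.3f}  {name}")
```

In production the embeddings come from the enrichment pipeline and the search runs against a vector index rather than an in-memory dict; this sketch only illustrates the scoring and ordering.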
- npm: https://www.npmjs.com/package/@pentatonic-ai/ai-agent-sdk
- PyPI: https://pypi.org/project/pentatonic-ai-agent-sdk/
- GitHub: https://github.com/Pentatonic-Ltd/ai-agent-sdk

## Integration

### MCP (Model Context Protocol)

TES ships an MCP server (`tes-memory`) bundled with the AI Agent SDK. Tools: `search_memories(query, userId?, limit?)`, `store_memory(content, metadata?)`, `list_memory_layers()`.

Install in Claude Code:

```text
/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai
/tes-memory:tes-setup
```

Config lives at `~/.claude/tes-memory.local.md` (or `~/.claude-pentatonic/tes-memory.local.md` on aliased installs).

For the legacy domain GraphQL surface (Things lifecycle, Holders, Products), TES also provides a GraphQL endpoint at `<subdomain>.api.pentatonic.com/api/graphql` with `Authorization: Bearer <token>`.

### OAuth 2.0

- Authorization Code Flow with PKCE (user-facing apps)
- Client Credentials Flow (server-to-server)
- Token format: `tes__`

### Webhooks

HMAC-SHA256 signed event delivery to your endpoints.

## Links

- Website: https://thingeventsystem.ai
- How it works: https://thingeventsystem.ai/how-it-works
- Benchmarks: https://thingeventsystem.ai/benchmarks
- Pricing: https://thingeventsystem.ai/pricing
- Sign up: https://thingeventsystem.ai/signup
- Documentation (proxy quickstart + reference): https://thingeventsystem.ai/docs
- Proxy endpoint: https://llm.api.pentatonic.com
- GraphQL Playground (legacy): https://thingeventsystem.ai/graphql
- E2E Demo (legacy): https://thingeventsystem.ai/e2e
- Status: https://thingeventsystem.ai/status
- npm (JS/TS): https://www.npmjs.com/package/@pentatonic-ai/ai-agent-sdk
- PyPI (Python): https://pypi.org/project/pentatonic-ai-agent-sdk/
- GitHub: https://github.com/Pentatonic-Ltd/ai-agent-sdk
- SDK docs: https://thingeventsystem.ai/sdk
- Discord: https://discord.gg/QZJe9FtkWj
- Parent company: https://pentatonic.com