Pricing

Per-token. Lower than upstream.

$0.50 per 1M input tokens, about a sixth of the $3 / 1M Anthropic lists for Sonnet direct. The preamble also compresses every turn, so your total token count drops too. Two wins, both visible on the invoice.

On Claude Code, Cursor, or Codex? The Claude Code plugin adds the same memory locally — free.

Savings calculator

What would your bill look like on TES?

Two numbers, not one. The TES invoice and the upstream invoice show up separately — audit the gap against what you pay today.

Current bill: $500/mo (slider marks: $50, $1k, $10k, $100k)

Bigger bill? Talk to us — Enterprise rate is $0.30 / 1M at volume.

Median input-token reduction: 27% (per 2026-04-24 benchmark).

Estimated new bill

Your TES bill (per-token)

$61/mo

$0.50 / 1M proxied input tokens, $20/mo minimum.

Your direct upstream bill (after compression)

$364/mo

You still pay your provider — just for ~27% fewer input tokens.

Total monthly saving vs current

$75 (15%)

Compared to today: $500/mo → $425/mo combined.

Assumes current spend is predominantly input-token cost billed at a Sonnet-equivalent rate of $3.00/1M. Output tokens, cached reads, and batch discounts are not modelled. Numbers are a preview based on early benchmarks — final per-workload data publishes alongside the benchmark page.
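The arithmetic behind the estimate can be sketched as follows. This is a minimal reconstruction from the figures on this page, not the calculator's actual code: it assumes all current spend is input tokens at $3.00 / 1M, that TES meters the compressed token count, and that the reduction is exactly 27%. The function and field names are made up for illustration. At $500/mo it lands within about a dollar of the numbers shown above (roughly $61 for TES and $365 upstream); the displayed $364 suggests the calculator uses an unrounded reduction figure, which is an inference rather than something the page states.

```python
from dataclasses import dataclass

# Figures taken from this page; treat them as assumptions of the sketch.
UPSTREAM_RATE = 3.00  # $ per 1M input tokens, Sonnet-equivalent
TES_RATE = 0.50       # $ per 1M proxied input tokens (Pro)
TES_MINIMUM = 20.00   # $ per month (Pro minimum)
REDUCTION = 0.27      # median input-token reduction, rounded


@dataclass
class Estimate:
    tes_bill: float
    upstream_bill: float
    combined: float
    saving: float
    saving_pct: float


def estimate(current_spend: float, reduction: float = REDUCTION) -> Estimate:
    """Estimate the two invoices for a given current monthly spend in dollars."""
    tokens_m = current_spend / UPSTREAM_RATE   # millions of input tokens today
    compressed_m = tokens_m * (1 - reduction)  # millions after compression
    upstream_bill = compressed_m * UPSTREAM_RATE
    # Assumption: the TES meter counts compressed tokens, floored at the minimum.
    tes_bill = max(TES_MINIMUM, compressed_m * TES_RATE)
    combined = tes_bill + upstream_bill
    saving = current_spend - combined
    return Estimate(tes_bill, upstream_bill, combined, saving,
                    100 * saving / current_spend)


if __name__ == "__main__":
    e = estimate(500)
    print(f"TES ${e.tes_bill:.0f}, upstream ${e.upstream_bill:.0f}, "
          f"saving ${e.saving:.0f} ({e.saving_pct:.0f}%)")
```

At low spend the $20 minimum, not the per-token meter, sets the TES line, so it is worth running the numbers for your own bill.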

Pricing

Per-token. Lower than upstream. Audit by comparing two invoices.

We charge a per-token rate that's lower than going direct to Anthropic or OpenAI, and your total token count drops because the preamble compresses every turn. You win twice, and both wins are visible on the bill.

Free

$0

Solo devs, weekend projects, and Claude Code / Cursor / Codex subscribers using the memory plugin.

  • 1M proxied input tokens / month
  • Anthropic Messages + OpenAI Chat Completions
  • Claude Code plugin — unlimited memory + sessions
  • Bring-your-own retrieval source (URLs, files)
  • Token-savings dashboard + request log
  • Soft-fail to upstream on TES error
  • Discord support

Pro

$0.50 per 1M input tokens

$20/mo minimum

Devs and small teams paying $50–$500/mo to LLM providers.

  • Per-token meter from $0 — minimum covers light use
  • Same routes + per-tenant memory layer
  • Per-project breakdown, exportable CSV
  • X-TES-Mode: passthrough for A/B comparisons
  • Email support, 1-business-day

Enterprise

$0.30 per 1M at volume

$1k/mo minimum, annual commit

Companies with a six- or seven-figure annual AI bill.

  • Volume per-token rate, predictable ceiling
  • Custom upstreams (MiniMax, vLLM, llama.cpp, your own inference)
  • Custom KGs / vector stores / private corpora
  • SLA, dedicated regions
  • Per-team breakdown, SSO, audit log
  • Slack channel, dedicated engineer

Reference: Anthropic lists Sonnet input at $3 / 1M tokens direct. Our Pro per-token rate is $0.50 / 1M — about a sixth of that, before the compression saving. You can audit the gap by putting our invoice next to your direct upstream invoice.

Compare plans

Feature | Free | Pro | Enterprise
Free input tokens / month | 1,000,000 | Metered from $0 | Per contract
Per-token rate | - | $0.50 / 1M input | $0.30 / 1M input at volume
Monthly minimum | None | $20/mo | $1,000/mo
Commitment | None | Month-to-month | Annual commit
Proxy upstreams | Anthropic Messages + OpenAI Chat Completions | Same | + MiniMax, vLLM, llama.cpp, your own inference
Claude Code plugin (memory) | Included (unlimited) | Included + per-project breakdown | Included + team-shared memory, SSO
Retrieval source | Bring your own (URLs, files) | + persistent memory, per-project | + custom KGs, vector stores, private corpora
Latency target | Best-effort | Best-effort + monitoring | SLA, dedicated regions
Dashboard | Savings + request log | + per-project, CSV export | + per-team, SSO, audit log
Support | Discord | Email, 1-business-day | Slack, dedicated engineer, SLA
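To see where the Pro and Enterprise minimums start to matter, here is a rough sketch of the monthly TES charge under each plan. It assumes the charge is simply the per-token rate times proxied input tokens, floored at the plan minimum, and it ignores the annual commit and any contract-specific Enterprise terms. The function names are hypothetical.

```python
# (rate in $ per 1M input tokens, monthly minimum in $), as listed on this page.
PRO = (0.50, 20.0)
ENTERPRISE = (0.30, 1000.0)


def monthly_cost(tokens_m: float, rate: float, minimum: float) -> float:
    """TES charge for tokens_m million proxied input tokens in a month."""
    return max(minimum, tokens_m * rate)


def cheaper_plan(tokens_m: float) -> tuple[str, float]:
    """Return the cheaper of Pro and Enterprise; ties go to Pro."""
    pro = monthly_cost(tokens_m, *PRO)
    enterprise = monthly_cost(tokens_m, *ENTERPRISE)
    return ("Pro", pro) if pro <= enterprise else ("Enterprise", enterprise)


if __name__ == "__main__":
    for tokens_m in (10, 40, 1000, 2000, 5000):
        plan, cost = cheaper_plan(tokens_m)
        print(f"{tokens_m:>5}M tokens: {plan} at ${cost:,.2f}")
```

Under these assumptions the Pro minimum covers the first 40M tokens, and Enterprise only becomes the cheaper line once Pro usage passes $1,000/mo, at about 2,000M proxied input tokens.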

Get started

Change one environment variable. Watch the bill drop.

Free tier covers a weekend project. Pro is per-token, $20/mo minimum. No credit card needed for Free.