Pricing
Per-token. Lower than upstream.
$0.50 per 1M input tokens — about a sixth of Anthropic's $3 / 1M for Sonnet direct. Then the preamble compresses every turn on top, so your total token count drops too. Two wins, both visible on the invoice.
On Claude Code, Cursor, or Codex? The Claude Code plugin adds the same memory locally — free.
Savings calculator
What would your bill look like on TES?
Two numbers, not one. The TES invoice and the upstream invoice show up separately — audit the gap against what you pay today.
Median input-token reduction: 27% (per 2026-04-24 benchmark).
Estimated new bill
Your TES bill (per-token): $61/mo
$0.50 / 1M proxied input tokens, $20/mo minimum.
Your direct upstream bill (after compression): $364/mo
You still pay your provider, just for ~27% fewer input tokens.
Total monthly saving vs current: $75 (15%)
Compared to today: $500/mo → $425/mo combined.
Assumes current spend is predominantly input-token cost billed at a Sonnet-equivalent rate of $3.00/1M. Output tokens, cached reads, and batch discounts are not modelled. Numbers are a preview based on early benchmarks; final per-workload data will be published alongside the benchmark page.
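The arithmetic behind the estimate can be sketched in a few lines of Python. This is an illustration, not the calculator's actual code: the function and field names are hypothetical, and it assumes (inferred from the example figures, not stated on this page) that both the TES bill and the upstream bill are metered on the token count after compression.

```python
from dataclasses import dataclass

UPSTREAM_RATE = 3.00   # $ per 1M input tokens (Sonnet-equivalent rate from the note above)
TES_RATE = 0.50        # $ per 1M proxied input tokens (Pro)
TES_MINIMUM = 20.00    # $ per month (Pro minimum)


@dataclass
class Estimate:
    tes_bill: float
    upstream_bill: float
    combined: float
    saving: float
    saving_pct: float


def estimate(current_spend, reduction=0.27):
    """Estimate the two monthly bills for a given current input-token spend."""
    # Input volume in millions of tokens implied by today's spend.
    tokens_m = current_spend / UPSTREAM_RATE
    # Assumption: both bills are metered on the compressed token count.
    compressed_m = tokens_m * (1 - reduction)
    tes_bill = max(TES_MINIMUM, compressed_m * TES_RATE)
    upstream_bill = compressed_m * UPSTREAM_RATE
    combined = tes_bill + upstream_bill
    saving = current_spend - combined
    return Estimate(tes_bill, upstream_bill, combined, saving,
                    100 * saving / current_spend)


if __name__ == "__main__":
    e = estimate(500.0)
    print(f"TES ${e.tes_bill:.0f}, upstream ${e.upstream_bill:.0f}, "
          f"combined ${e.combined:.0f}, saving ${e.saving:.0f} ({e.saving_pct:.0f}%)")
```

With a $500/mo starting point and a flat 27% reduction, each result lands within about a dollar of the figures shown above; the calculator's exact rounding and unrounded reduction figure are not documented here. For small spends the $20 minimum dominates, so `estimate(30.0).tes_bill` is the minimum rather than the metered amount.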
Pricing
Per-token. Lower than upstream. Audit by comparing two invoices.
We charge a per-token rate that's lower than going direct to Anthropic or OpenAI, and your total token count drops because the preamble compresses every turn. You win twice, and both wins are visible on the bill.
Free
Solo devs, weekend projects, and Claude Code / Cursor / Codex subscribers using the memory plugin.
- 1M proxied input tokens / month
- Anthropic Messages + OpenAI Chat Completions
- Claude Code plugin — unlimited memory + sessions
- Bring-your-own retrieval source (URLs, files)
- Token-savings dashboard + request log
- Soft-fail to upstream on TES error
- Discord support
Pro
$20/mo minimum
Devs and small teams paying $50–$500/mo to LLM providers.
- Per-token meter from $0 — minimum covers light use
- Same routes + per-tenant memory layer
- Per-project breakdown, exportable CSV
- X-TES-Mode: passthrough for A/B comparisons
- Email support, 1-business-day
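For the A/B comparison mentioned in the list above, the header can be toggled per request. A minimal Python sketch, with loud assumptions: `request_headers` is a hypothetical helper, the API key is a placeholder, and it assumes the proxy accepts the standard Anthropic Messages headers unchanged and that passthrough means the request is forwarded without TES processing. Only the header name and value come from this page.

```python
def request_headers(api_key, passthrough=False):
    """Headers for one proxied Anthropic Messages call (hypothetical helper)."""
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    if passthrough:
        # Header name and value are taken from the Pro feature list; the exact
        # behaviour (assumed: skip TES processing) is not documented here.
        headers["X-TES-Mode"] = "passthrough"
    return headers


if __name__ == "__main__":
    treated = request_headers("sk-placeholder")
    control = request_headers("sk-placeholder", passthrough=True)
    # The two arms differ only by the mode header.
    print(sorted(set(control) - set(treated)))
```

Sending the same prompts through both arms and comparing input-token counts in the request log gives a per-workload view of the reduction.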
Enterprise
$1k/mo minimum, annual commit
Companies with a six- or seven-figure annual AI bill.
- Volume per-token rate, predictable ceiling
- Custom upstreams (MiniMax, vLLM, llama.cpp, your own inference)
- Custom KGs / vector stores / private corpora
- SLA, dedicated regions
- Per-team breakdown, SSO, audit log
- Slack channel, dedicated engineer
Reference: Anthropic lists Sonnet input at $3 / 1M tokens direct. Our Pro per-token rate is $0.50 / 1M — about a sixth of that, before the compression saving. You can audit the gap by putting our invoice next to your direct upstream invoice.
Compare plans
| Feature | Free | Pro | Enterprise |
|---|---|---|---|
| Free input tokens / month | 1,000,000 | Metered from $0 | Per contract |
| Per-token rate | — | $0.50 / 1M input | $0.30 / 1M input at volume |
| Monthly minimum | None | $20/mo | $1,000/mo |
| Commitment | None | Month-to-month | Annual commit |
| Proxy upstreams | Anthropic Messages + OpenAI Chat Completions | Same | + MiniMax, vLLM, llama.cpp, your own inference |
| Claude Code plugin (memory) | Included — unlimited | Included + per-project breakdown | Included + team-shared memory, SSO |
| Retrieval source | Bring your own (URLs, files) | + persistent memory, per-project | + custom KGs, vector stores, private corpora |
| Latency target | Best-effort | Best-effort + monitoring | SLA, dedicated regions |
| Dashboard | Savings + request log | + per-project, CSV export | + per-team, SSO, audit log |
| Support | Discord | Email, 1-business-day | Slack, dedicated engineer, SLA |
Get started
Change one environment variable. Watch the bill drop.
Free tier covers a weekend project. Pro is per-token, $20/mo minimum. No credit card needed for Free.
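This page does not name the environment variable or the proxy address, so the following Python sketch rests on assumptions: that the variable is the base-URL override the official SDKs already read (`ANTHROPIC_BASE_URL` for Anthropic, `OPENAI_BASE_URL` for OpenAI), and that `TES_BASE_URL` is a placeholder for whatever endpoint your account is given. `proxied_env` is a hypothetical helper.

```python
import os

# Placeholder address: the real TES endpoint is not given on this page.
TES_BASE_URL = "https://tes.example.invalid"

# Base-URL override variables read by the official Python SDKs.
BASE_URL_VARS = {
    "anthropic": "ANTHROPIC_BASE_URL",
    "openai": "OPENAI_BASE_URL",
}


def proxied_env(base_env, provider="anthropic"):
    """Return a copy of base_env with the SDK base URL pointed at the proxy."""
    env = dict(base_env)
    env[BASE_URL_VARS[provider]] = TES_BASE_URL
    return env


if __name__ == "__main__":
    env = proxied_env(os.environ)
    print(env["ANTHROPIC_BASE_URL"])
```

Passing the returned mapping as `env=` to `subprocess.run` starts an existing application against the proxy without touching its code; removing the variable reverts to the direct upstream.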