Name: Thing Event System (TES)
Author: Pentatonic

Question 1

I'm on a Claude Code, Cursor, or Codex subscription — does the per-token discount apply to me?

Accepted Answer

No, and the limitation is on the upstream providers' side, not ours. Anthropic and OpenAI bind subscription OAuth tokens to the originating client; routing them through any third-party proxy returns a 401 at the upstream's edge. Your subscription bill stays the same. What we do offer subscription users is the Claude Code plugin: persistent memory across sessions, free for personal use, and free for teams until you exceed 1M proxied input tokens per month (a quota the plugin itself doesn't consume).

Question 2

How does billing work for the proxy?

Accepted Answer

Free tier is 1M proxied input tokens per month, no credit card. Pro is $0.50 per 1M input tokens proxied, with a $20/mo minimum. That is the same per-token shape as Anthropic or OpenAI, at roughly a sixth of the direct rate ($3 per 1M for Sonnet input). Enterprise is $0.30 per 1M at volume on an annual commit. We only meter input tokens we actually forward upstream; tokens that compression saved never appear on the invoice.
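As a rough illustration of the Pro arithmetic above, here is a minimal sketch. It assumes the $20/mo minimum acts as a floor on the metered charge and that Pro carries no free allowance; neither detail is spelled out in this FAQ, and the names `pro_invoice` and `fits_free_tier` are hypothetical.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)   # rates are quoted per 1M input tokens
FREE_MONTHLY_TOKENS = 1_000_000        # Free tier quota from the FAQ
PRO_RATE = Decimal("0.50")             # USD per 1M forwarded input tokens
PRO_MINIMUM = Decimal("20")            # USD per month


def fits_free_tier(forwarded_input_tokens: int) -> bool:
    """True if a month's forwarded input tokens stay within the Free quota."""
    return forwarded_input_tokens <= FREE_MONTHLY_TOKENS


def pro_invoice(forwarded_input_tokens: int) -> Decimal:
    """Estimate a monthly Pro invoice in USD.

    Assumption: the monthly minimum is a floor on the metered charge.
    Only tokens actually forwarded upstream are counted.
    """
    metered = PRO_RATE * Decimal(forwarded_input_tokens) / TOKENS_PER_UNIT
    return max(metered, PRO_MINIMUM).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for tokens in (10_000_000, 40_000_000, 100_000_000):
        print(f"{tokens:>11,} tokens -> ${pro_invoice(tokens)}")
```

Under these assumptions the minimum stops mattering at 40M forwarded input tokens per month, since 40 x $0.50 = $20.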

Question 3

Why per-token instead of a flat rate or savings share?

Accepted Answer

A flat rate caps your upside when you scale into us. A savings share ends in renegotiation and audit fights over what you 'would have' spent. Per-token is auditable: put our invoice next to your direct upstream invoice and the gap between the two is the result, with no counterfactual to argue about. You pay for what you send through us.

Question 4

Do you charge for tokens the compression saved?

Accepted Answer

No. You're only billed for input tokens actually sent upstream through us. The tokens avoided by the preamble injection don't appear on our invoice.

Question 5

What happens if I exceed the Free tier?

Accepted Answer

You can upgrade to Pro from the dashboard and keep the same API key and the same env var. Or stay on Free, and we'll rate-limit your requests until the quota resets when the next month rolls over.

Question 6

Can I opt out of compression per request?

Accepted Answer

Yes. Send the header X-TES-Mode: passthrough on any request and we forward to the upstream provider untouched. Useful for A/B'ing the bill, or when you want to force-rerun without the cached preamble.
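A minimal sketch of setting the header from Python's standard library follows. The `X-TES-Mode: passthrough` header and the proxy host come from this page; the authentication headers are an assumption that the proxy accepts the usual Anthropic Messages API headers unchanged, and the helper names are hypothetical.

```python
import json
import urllib.request

TES_BASE_URL = "https://llm.api.pentatonic.com"


def tes_headers(api_key: str, passthrough: bool = False) -> dict:
    """Build request headers, optionally opting this request out of compression."""
    headers = {
        "content-type": "application/json",
        # Assumption: the proxy accepts standard Anthropic auth headers as-is.
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
    }
    if passthrough:
        # Documented opt-out: forward to the upstream provider untouched.
        headers["X-TES-Mode"] = "passthrough"
    return headers


def build_messages_request(
    payload: dict, api_key: str, passthrough: bool = False
) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the proxy's /v1/messages route."""
    return urllib.request.Request(
        url=f"{TES_BASE_URL}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers=tes_headers(api_key, passthrough),
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. For an A/B comparison, the same payload can be built twice, once with `passthrough=True` and once without.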

Question 7

Which providers are supported?

Accepted Answer

The deployed proxy at llm.api.pentatonic.com routes /v1/messages to api.anthropic.com and /v1/chat/completions to api.openai.com. Other OpenAI-compatible upstreams (MiniMax, vLLM, llama.cpp, your own inference) are available on the Enterprise tier as dedicated upstream routing — same wire shape, different forwarding URL.
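The routing described above can be pictured as a small lookup table. This is an illustrative sketch only, not the proxy's actual implementation; the `https` scheme and the names `UPSTREAM_BY_PATH` and `upstream_url` are assumptions.

```python
# Path on llm.api.pentatonic.com -> upstream origin, as listed in the FAQ.
UPSTREAM_BY_PATH = {
    "/v1/messages": "https://api.anthropic.com",
    "/v1/chat/completions": "https://api.openai.com",
}


def upstream_url(path: str) -> str:
    """Return the upstream URL a proxied request path would be forwarded to."""
    try:
        return UPSTREAM_BY_PATH[path] + path
    except KeyError:
        raise ValueError(f"no upstream configured for {path!r}") from None
```

In this picture, an Enterprise dedicated upstream amounts to one more entry whose origin points at a different forwarding URL while the wire shape stays the same.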

Question 8

Where is my data handled?

Accepted Answer

TES runs on Cloudflare's global edge network. We don't read or train on your responses. Enterprise customers can specify data residency, including EU-only processing, and route through dedicated regions.

Question 9

Can I cancel at any time?

Accepted Answer

Yes on Free and Pro: both are month-to-month with no term commitment. Enterprise is an annual commit. You can export your request log and token-savings data from the dashboard at any time.

Feature	Free	Pro	Enterprise
Free input tokens / month	1,000,000	Metered from $0	Per contract
Per-token rate	—	$0.50 / 1M input	$0.30 / 1M input at volume
Monthly minimum	None	$20/mo	$1,000/mo
Commitment	None	Month-to-month	Annual commit
Proxy upstreams	Anthropic Messages + OpenAI Chat Completions	Same	+ MiniMax, vLLM, llama.cpp, your own inference
Claude Code plugin (memory)	Included — unlimited	Included + per-project breakdown	Included + team-shared memory, SSO
Retrieval source	Bring your own (URLs, files)	+ persistent memory, per-project	+ custom KGs, vector stores, private corpora
Latency target	Best-effort	Best-effort + monitoring	SLA, dedicated regions
Dashboard	Savings + request log	+ per-project, CSV export	+ per-team, SSO, audit log
Support	Discord	Email, 1-business-day	Slack, dedicated engineer, SLA

What would your bill look like on TES?

Per-token. Lower than upstream. Audit by comparing two invoices.

Change one environment variable. Watch the bill drop.