How AgentCTX Works

Follow one agent's journey — from starting empty to full-stack compression — in 7 interactive steps.

Step 1 of 7

Starting Empty

Every session, an agent starts with nothing.

No memory of yesterday. No recall of decisions made. The agent is handed 40+ tool descriptions it may never use — burning nearly 10,000 tokens before it can even begin to reason. On smaller models (128K), this overhead triggers context overflow within 30 turns. On frontier models (200K–1M), the extra room just means more space for the agent to lose important details in the noise — the "lost in the middle" problem gets worse, not better.

Your 200K context over a 100-turn session:

- No Optimization: 200,000 tokens (100% — context overflow)
- Provider Compaction: 150,000 tokens (75% — lossy compaction loop)
- With CTX: 8,000 tokens (4% — CTX-native compression)

* CTX-native: all tool calls, responses, and memory encoded in CTX with external memory via SurrealDB

- 9,950 tokens wasted on tool descriptions per session
- 100% context overflow after extended agentic sessions
- 6,500 tokens to rebuild context from scratch
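The overflow claim is back-of-the-envelope arithmetic. A minimal sketch: the 9,950-token overhead and the window sizes come from this page, while the 4,000 tokens-per-turn figure is an assumption chosen for illustration.

```python
# Context budget arithmetic. Only the 9,950-token tool-description
# overhead and the 128K/200K windows come from the page; the
# per-turn cost is an assumed, illustrative figure.
TOOL_DESCRIPTION_OVERHEAD = 9_950   # tokens burned before turn 1
TOKENS_PER_TURN = 4_000             # assumed: call + response + reasoning

def turns_until_overflow(window: int) -> int:
    """Full turns that fit after the fixed tool-description overhead."""
    return (window - TOOL_DESCRIPTION_OVERHEAD) // TOKENS_PER_TURN

print(turns_until_overflow(128_000))  # smaller model: 29 turns
print(turns_until_overflow(200_000))  # frontier model: 47 turns
```

Under these assumptions a 128K model runs out of room in 29 turns, matching the "within 30 turns" claim; a 200K window only delays the same wall.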
Step 2 of 7

The Agent Speaks CTX

What if it could ask in 10 tokens instead of 10,000?

CTX is a structured query language designed for how agents actually think. Instead of verbose JSON-RPC payloads, the agent writes concise operations — each one a thought and an action fused into a single expression.

Standard MCP (46 tokens):
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "search_issues",
    "arguments": {
      "query": "state:open is:issue",
      "limit": 10
    }
  },
  "id": 1
}
CTX (9 tokens):
?t github.search_issues "state:open is:issue" ^10
- 76% fewer output tokens per operation
- 7 operators × 7 planes = entire API
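The percentages are easy to recompute from the token counts on this page. A quick sketch; note that the 76% headline comes from the per-call 49 → 12 figure, which rounds up from 75.5%.

```python
# Recompute the savings figures from the before/after token counts above.
def percent_saved(before: int, after: int) -> float:
    """Percent reduction in tokens, rounded to one decimal place."""
    return round(100 * (1 - after / before), 1)

print(percent_saved(46, 9))   # the MCP-vs-CTX example above: 80.4
print(percent_saved(49, 12))  # per-call output figure: 75.5 (the 76% headline)
```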
Step 3 of 7

The Gateway Routes

One gateway. Seven planes. The agent doesn't need to know which backend handles what.

The CTX Gateway is a zero-trust router. It parses the agent's statement, identifies the target plane, runs it through 8-layer security middleware, and dispatches to the right backend — all in one hop. Bigger context windows don't solve this: routing 70 tool descriptions through a 1M-token window still costs the same per token, and the agent still has to parse them all.

?k "auth middleware" #code ^3
- 1 gateway endpoint instead of 70 tool descriptions
- 8 security middleware layers per request
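Plane routing can be pictured as a one-line dispatch on the statement's operator and plane letter. This sketch infers the grammar from the CTX examples on this page (`?t`, `?m`, `+m`, `?k`); the real gateway's plane set and parser are certainly richer.

```python
import re

# Plane letters inferred from this page's examples; assumed, not the real set.
PLANES = {"t": "tools", "m": "memory", "k": "knowledge"}

def route(statement: str) -> str:
    """Return the backend plane a CTX statement targets."""
    m = re.match(r"[?+]([a-z])\b", statement)
    if not m or m.group(1) not in PLANES:
        raise ValueError(f"unroutable statement: {statement!r}")
    return PLANES[m.group(1)]

print(route('?k "auth middleware" #code ^3'))                      # knowledge
print(route('?t github.search_issues "state:open is:issue" ^10'))  # tools
```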
Step 4 of 7

Memory That Persists

The agent remembers — across sessions, across days.

CTX memory is a 5-layer cognitive architecture: from split-second perception up to long-term procedural knowledge. Every memory is graph-structured, tag-searchable, and recalled in sub-millisecond time — flat to 50,000 entries. This is what provider compaction can't do: it summarizes away the details, while CTX stores them externally and recalls exactly what the agent needs, when it needs it.

Agent stores a memory (10 tokens):
+m "lesson" #architecture "singleton pattern breaks integration tests"
Agent recalls it days later — 3 results, <1ms (6 tokens):
?m "architecture" @7d ^3
- 0.78ms memory recall, flat to 50K entries
- 5 cognitive layers (perception → procedural)
- 97% fewer tokens for session resume
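The store/recall pair maps onto a very small data structure. A toy sketch of tag plus time-window recall; field names and matching rules here are assumptions, and the real store is graph-structured and backed by SurrealDB rather than a Python list.

```python
import time

class MemoryStore:
    """Toy stand-in for the CTX memory plane: +m stores, ?m recalls."""

    def __init__(self):
        self.entries = []

    def store(self, kind: str, tag: str, text: str) -> None:
        # +m "lesson" #architecture "..."
        self.entries.append(
            {"kind": kind, "tag": tag, "text": text, "ts": time.time()}
        )

    def recall(self, tag: str, window_days: float, limit: int) -> list:
        # ?m "architecture" @7d ^3  =>  tag match, time window, result cap
        cutoff = time.time() - window_days * 86_400
        hits = [e for e in self.entries if e["tag"] == tag and e["ts"] >= cutoff]
        return hits[-limit:]

mem = MemoryStore()
mem.store("lesson", "architecture", "singleton pattern breaks integration tests")
print(mem.recall("architecture", window_days=7, limit=3))
```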
Step 5 of 7

The Sidecar Translates

Compiled, not interpreted. Then cryptographically signed.

The sidecar is a deterministic compiler — no LLM in the translation path. It takes every CTX operation and compiles it into 7 target formats, including human-readable English. Every translation is Ed25519 signed, creating an immutable audit trail.

CTX input: ?m "arch" @7d ^3
English output: The agent recalled 3 architecture-related memories from the last 7 days.
- 7 compilation targets (SurrealQL, SQL, REST, GraphQL, JSON-RPC, English, CTXB)
- 0 LLMs in the translation path — pure compiler
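The sign-every-translation idea can be sketched in a few lines. This page specifies Ed25519; HMAC-SHA256 from the standard library stands in here so the example runs without extra dependencies, and the one-target compiler is a hypothetical toy, not the sidecar's real logic.

```python
# Sketch of a signed translation record. HMAC-SHA256 is a stdlib
# stand-in for the Ed25519 signing the page describes; the compile
# step is a deterministic toy with a single hypothetical rule.
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # stand-in for the sidecar's private key

def compile_to_english(ctx: str) -> str:
    # Deterministic, no LLM: a fixed template keyed on the operation.
    if ctx.startswith("?m"):
        return "The agent recalled memories matching the query."
    raise NotImplementedError(ctx)

def signed_translation(ctx: str) -> dict:
    record = {"input": ctx, "english": compile_to_english(ctx)}
    payload = json.dumps(record, sort_keys=True).encode()
    record["sig"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return record

print(signed_translation('?m "arch" @7d ^3'))
```

Because the compile step is a pure function of its input, the same CTX statement always yields the same record and the same signature, which is what makes an append-only audit trail verifiable.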
Step 6 of 7

Full-Stack Compression

Not just input. Every direction.

CTX compresses in all directions: input tokens (tool descriptions), output tokens (agent writes), response tokens (data returned), and delegation tokens (agent-to-agent handoffs). A traditional prose handoff costs 2,000 tokens — with CTX delegation, it's 15. Whether your model has 128K or 1M tokens, every token costs money. Larger windows mean larger bills.

- Input: 91% compression (9,950 → 850 tokens)
- Output: 76% compression (49 → 12 tokens/call)
- Response: 25–60% compression (JSON → CTX encoding)
- Delegation: 99% compression (2,000 → 15 tokens per handoff)
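To make "every token costs money" concrete, here is a rough bill for the tool-description overhead alone. The $3 per million input tokens and 1,000 sessions per day are assumptions for illustration; the 9,950 and 850 figures come from this page.

```python
# Rough monthly cost of per-session tool-description overhead.
# Price and session volume are assumed; token counts are from the page.
PRICE_PER_MTOK = 3.00       # assumed: $ per million input tokens
SESSIONS_PER_DAY = 1_000    # assumed fleet volume

def monthly_cost(tokens_per_session: int) -> float:
    """Dollars per 30-day month spent on this overhead alone."""
    return tokens_per_session * SESSIONS_PER_DAY * 30 * PRICE_PER_MTOK / 1_000_000

print(round(monthly_cost(9_950), 2))  # without CTX: 895.5
print(round(monthly_cost(850), 2))    # with CTX: 76.5
```

Under these assumptions the overhead alone drops from roughly $900 to under $80 a month, before counting output, response, or delegation savings.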
Step 7 of 7

Agents Talk to Agents

One agent is useful. A network of agents with a shared language is transformative.

When agents coordinate through CTX, a handoff is 15 tokens instead of 2,000. The memory bus means one agent writes what it learned and the next reads it instantly. Federation means agents across organizations can query each other over mTLS.

Traditional handoff (~2,000 tokens):
// Prose handoff: ~2,000 tokens
{
  "type": "handoff",
  "from": "agent-alpha",
  "context": "I have completed the parser implementation 
    including all edge cases for nested pipes. The SSE 
    transport layer needs wiring next. Authentication 
    is handled via OIDC auto-discovery. Note that the 
    singleton pattern caused test failures, switched to 
    factory pattern. See files: src/parser/...",
  // ... 1,900 more tokens of context
}
CTX handoff (15 tokens):
+m "handoff" #state "parser done, SSE needs wiring"
- 15 tokens per agent-to-agent handoff
- 2,000 tokens for the same handoff in prose
- 99% compression on coordination messages
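The memory-bus handoff reduces to a write followed by a read on shared storage. A toy sketch; the class and method names are illustrative, not the real API.

```python
# One agent writes a terse state line; the next reads it instantly.
class MemoryBus:
    """Toy shared bus for agent-to-agent handoffs."""

    def __init__(self):
        self.log = []

    def write(self, agent: str, kind: str, tag: str, text: str) -> None:
        # +m "handoff" #state "parser done, SSE needs wiring"
        self.log.append((agent, kind, tag, text))

    def read(self, tag: str) -> list:
        return [(agent, text) for agent, _, t, text in self.log if t == tag]

bus = MemoryBus()
bus.write("agent-alpha", "handoff", "state", "parser done, SSE needs wiring")
print(bus.read("state"))  # agent-beta picks up the 15-token handoff
```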

Ready to give your agents a brain?

Install AgentCTX in 30 seconds. Your agents start saving tokens immediately — no behavior changes required.