How AgentCTX Works
Follow one agent's journey — from starting empty to full-stack compression — in 7 interactive steps.
Starting Empty
Every session, an agent starts with nothing.
No memory of yesterday. No recall of decisions made. The agent is handed 40+ tool descriptions it may never use, burning nearly 10,000 tokens before it can even begin to reason. On models with smaller context windows (128K), that overhead can trigger context overflow within 30 turns. On frontier models (200K–1M), the extra room just means more space for important details to get lost in the noise: the "lost in the middle" problem gets worse, not better.
The Agent Speaks CTX
What if it could ask in 10 tokens instead of 10,000?
CTX is a structured query language designed for how agents actually think. Instead of verbose JSON-RPC payloads, the agent writes concise operations — each one a thought and an action fused into a single expression.
// JSON-RPC tool call: ~70 tokens
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "search_issues",
    "arguments": {
      "query": "state:open is:issue",
      "limit": 10
    }
  },
  "id": 1
}

// The same request in CTX: ~10 tokens
?t github.search_issues "state:open is:issue" ^10
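The CTX statement above decomposes into a plane, a tool, an argument string, and a result limit. A minimal sketch of that decomposition, with the sigil semantics (`?t` = tool query, `^N` = limit) inferred from the example rather than a published grammar:

```python
import re

def parse_ctx(stmt: str) -> dict:
    """Parse a CTX tool-query statement like:
        ?t github.search_issues "state:open is:issue" ^10
    Sigils are inferred from the example: ?t marks a tool
    query, the quoted string carries arguments, ^N caps results."""
    m = re.match(r'\?t\s+(\S+)\s+"([^"]*)"(?:\s+\^(\d+))?', stmt)
    if not m:
        raise ValueError(f"not a CTX tool query: {stmt!r}")
    target, query, limit = m.groups()
    plane, _, tool = target.partition(".")
    return {
        "plane": plane,                      # e.g. "github"
        "tool": tool,                        # e.g. "search_issues"
        "query": query,                      # raw argument string
        "limit": int(limit) if limit else None,
    }

op = parse_ctx('?t github.search_issues "state:open is:issue" ^10')
```

Because the whole statement fits one regular expression, parsing stays deterministic and cheap — no model call needed to understand the agent's intent.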
The Gateway Routes
One gateway. Seven planes. The agent doesn't need to know which backend handles what.
The CTX Gateway is a zero-trust router. It parses the agent's statement, identifies the target plane, runs it through 8-layer security middleware, and dispatches to the right backend — all in one hop. Bigger context windows don't solve this: routing 70 tool descriptions through a 1M-token window still costs the same per-token, and the agent still has to parse them all.
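The parse → middleware → dispatch hop can be sketched as follows. The plane names, backend behavior, and single middleware layer here are illustrative stand-ins, not AgentCTX's actual set:

```python
def check_plane(op: dict) -> dict:
    """One stand-in middleware layer; the real gateway chains 8
    (auth, rate limiting, audit, and so on -- names assumed)."""
    if not op.get("plane"):
        raise ValueError("statement has no target plane")
    return op

MIDDLEWARE = [check_plane]

# Hypothetical backends keyed by plane name.
BACKENDS = {
    "github": lambda op: f"github backend ran {op['tool']}",
    "memory": lambda op: f"memory plane ran {op['tool']}",
}

def route(op: dict) -> str:
    """One hop: run the middleware pipeline, then dispatch
    to whichever backend owns the target plane."""
    for layer in MIDDLEWARE:
        op = layer(op)
    try:
        backend = BACKENDS[op["plane"]]
    except KeyError:
        raise LookupError(f"unknown plane: {op['plane']}") from None
    return backend(op)

result = route({"plane": "github", "tool": "search_issues"})
```

The agent only ever addresses a plane; which process, host, or protocol serves that plane is the gateway's concern.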
Memory That Persists
The agent remembers — across sessions, across days.
CTX memory is a 5-layer cognitive architecture: from split-second perception up to long-term procedural knowledge. Every memory is graph-structured, tag-searchable, and recalled in sub-millisecond time, with flat latency up to 50,000 entries. This is what provider compaction can't do: it summarizes the details away, while CTX stores them externally and recalls exactly what the agent needs, when it needs it.
+m "lesson" #architecture "singleton pattern breaks integration tests"
?m "architecture" @7d ^3
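The `+m` / `?m` pair above maps naturally onto a tag-indexed store with a recency filter. A minimal sketch, assuming the quoted term in `?m` is a tag lookup, `@7d` a recency window, and `^3` a result cap:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    kind: str          # e.g. "lesson"
    tags: set          # e.g. {"architecture"}
    text: str
    ts: float = field(default_factory=time.time)

class MemoryStore:
    """Toy tag-indexed store mirroring +m / ?m semantics."""
    def __init__(self):
        self._entries = []
        self._by_tag = {}

    def add(self, kind, text, *tags):
        # +m "lesson" #architecture "singleton pattern breaks ..."
        m = Memory(kind, set(tags), text)
        self._entries.append(m)
        for t in tags:
            self._by_tag.setdefault(t, []).append(m)
        return m

    def query(self, tag, within_days=7, limit=3):
        # ?m "architecture" @7d ^3
        cutoff = time.time() - within_days * 86400
        hits = [m for m in self._by_tag.get(tag, []) if m.ts >= cutoff]
        return sorted(hits, key=lambda m: m.ts, reverse=True)[:limit]

store = MemoryStore()
store.add("lesson", "singleton pattern breaks integration tests", "architecture")
recent = store.query("architecture")
```

The tag index is what keeps recall flat as the store grows: a query touches only the entries under its tag, never the full entry list.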
The Sidecar Translates
Compiled, not interpreted. Then cryptographically signed.
The sidecar is a deterministic compiler — no LLM in the translation path. It takes every CTX operation and compiles it into 7 target formats, including human-readable English. Every translation is Ed25519 signed, creating an immutable audit trail.
?m "arch" @7d ^3
// English translation: "The agent recalled 3 architecture-related memories from the last 7 days."
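A deterministic compiler here means template expansion, not generation. The sketch below shows that plus a signature over each translation; the real sidecar signs with Ed25519, while this stdlib-only version substitutes HMAC-SHA256, and the template and field names are invented for illustration:

```python
import hashlib
import hmac
import json

def compile_to_english(op: dict) -> str:
    """Deterministic template expansion -- no LLM in the path.
    The template below is a guess at the sidecar's output format."""
    if op["verb"] == "?m":
        return (f"The agent recalled {op['limit']} {op['tag']}-related "
                f"memories from the last {op['days']} days.")
    raise ValueError(f"no template for {op['verb']}")

SIGNING_KEY = b"demo-key"  # the real system uses an Ed25519 keypair

def translate(op: dict) -> dict:
    text = compile_to_english(op)
    payload = json.dumps({"op": op, "english": text}, sort_keys=True).encode()
    # HMAC-SHA256 stands in for the Ed25519 signature in this sketch;
    # signing the canonicalized payload is what makes the trail auditable.
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"english": text, "sig": sig}

entry = translate({"verb": "?m", "tag": "architecture", "limit": 3, "days": 7})
```

Because both the template expansion and the canonical JSON serialization are deterministic, the same operation always yields the same signed record — which is what lets a third party re-verify the trail later.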
Full-Stack Compression
Not just input. Every direction.
CTX compresses in all directions: input tokens (tool descriptions), output tokens (agent writes), response tokens (data returned), and delegation tokens (agent-to-agent handoffs). A traditional prose handoff costs 2,000 tokens — with CTX delegation, it's 15. Whether your model has 128K or 1M tokens, every token costs money. Larger windows mean larger bills.
Agents Talk to Agents
One agent is useful. A network of agents with a shared language is transformative.
When agents coordinate through CTX, a handoff is 15 tokens instead of 2,000. The memory bus means one agent writes what it learned and the next reads it instantly. Federation means agents across organizations can query each other over mTLS.
// Prose handoff: ~2,000 tokens
{
  "type": "handoff",
  "from": "agent-alpha",
  "context": "I have completed the parser implementation
    including all edge cases for nested pipes. The SSE
    transport layer needs wiring next. Authentication
    is handled via OIDC auto-discovery. Note that the
    singleton pattern caused test failures, switched to
    factory pattern. See files: src/parser/...",
  // ... 1,900 more tokens of context
}

// CTX handoff: ~15 tokens
+m "handoff" #state "parser done, SSE needs wiring"
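The "one agent writes, the next reads" pattern of the memory bus can be sketched with a toy in-process bus; the API below is a stand-in, not CTX's actual interface:

```python
class MemoryBus:
    """Toy shared bus: agents exchange state by kind-tagged entries."""
    def __init__(self):
        self._entries = []

    def write(self, kind: str, tags: set, text: str) -> None:
        self._entries.append((kind, tags, text))

    def read(self, kind: str) -> list:
        return [text for k, _, text in self._entries if k == kind]

bus = MemoryBus()
# agent-alpha writes: +m "handoff" #state "parser done, SSE needs wiring"
bus.write("handoff", {"state"}, "parser done, SSE needs wiring")
# agent-beta reads the handoff and picks up where alpha left off
next_steps = bus.read("handoff")
```

The point of the compression is visible here: the handoff payload is the 15-token CTX line itself, and any deeper context stays in shared memory where the next agent can query it on demand instead of receiving it all up front.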
Ready to give your agents a brain?
Install AgentCTX in 30 seconds. Your agents start saving tokens immediately — no behavior changes required.