
Formal Specification

This page defines the formal grammar, parsing rules, error handling, and AST representation of CTX.

statement = operator , plane , [ ":" , verb ] , [ whitespace , decorator ] ,
            [ whitespace , target ] , { whitespace , filter } ,
            [ whitespace , payload ] ;
operator = "?" | "!" | ">" | "+" | "~" | "-" | "^" ;
plane = "k" | "t" | "s" | "m" | "a" | "i" | "l" ;
verb = "clarify" | "escalate" | "partial" | "handoff"
     | "check" | "edge" | "cross_ref" | "pattern" | "scope" | "apply"
     | "context" | "task" | "exit" | "example"
     | "chat" | "complete" | "embed" ;
target = quoted_string | dotted_identifier | simple_identifier ;
dotted_identifier = identifier , { "." , identifier } ;
simple_identifier = /[^\s#@^*&"]+/ ;
identifier = /\w+/ ;
quoted_string = '"' , { char | escape } , '"' ;
escape = '\"' | '\\' ;
filter = tag_filter | time_filter | limit_filter
       | project_filter | format_filter | depth_filter ;
tag_filter = "#" , /\S+/ ;
time_filter = "@" , ( duration | iso_date ) ;
limit_filter = "^" , /\d+/ ;
depth_filter = "^depth=" , /\d+/ ;
project_filter = "*" , /\S+/ ;
format_filter = "&" , /\S+/ ;
duration = /\d+[smhdwMy]/ ;
iso_date = /\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}Z?)?/ ;
payload = quoted_string | kv_pairs ;
kv_pairs = kv_pair , { whitespace , kv_pair } ;
kv_pair = key , "=" , ( quoted_string | /[^\s]+/ ) ;
key = /\w+/ ;
decorator = "@" , decorator_name , [ decorator_args ] ;
decorator_name = "similar" | "relate" | "traverse"
               | "changes" | "promote" | "grounded" | "decay" ;
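
The filter productions above are told apart by their prefix character. The following is a minimal sketch of that dispatch, assuming the input has already been split into whitespace-delimited tokens. `classifyFilterToken` and the `Filter` shape are hypothetical names for illustration, not part of the CTX API. Since `limit_filter` and `depth_filter` both start with `^`, the sketch tests the longer `^depth=` form first.

```typescript
// Hypothetical helper: maps one whitespace-delimited token to a filter,
// following the filter productions of the grammar.
type FilterType = "tag" | "time" | "limit" | "project" | "format" | "depth";

interface Filter {
  type: FilterType;
  value: string;
}

// Anchored versions of the duration and iso_date patterns from the grammar.
const DURATION = /^\d+[smhdwMy]$/;
const ISO_DATE = /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}Z?)?$/;

function classifyFilterToken(token: string): Filter | null {
  // depth_filter is checked before limit_filter because both begin with "^".
  if (/^\^depth=\d+$/.test(token)) {
    return { type: "depth", value: token.slice("^depth=".length) };
  }
  if (/^\^\d+$/.test(token)) {
    return { type: "limit", value: token.slice(1) };
  }

  const rest = token.slice(1);
  if (rest.length === 0) return null; // a bare prefix carries no value

  switch (token.charAt(0)) {
    case "#":
      return { type: "tag", value: rest };
    case "*":
      return { type: "project", value: rest };
    case "&":
      return { type: "format", value: rest };
    case "@":
      // Only durations and ISO dates are time filters; other "@" tokens
      // (for example a decorator name) are not filters.
      return DURATION.test(rest) || ISO_DATE.test(rest)
        ? { type: "time", value: rest }
        : null;
    default:
      return null;
  }
}
```

In this sketch a token such as `@similar` returns `null` rather than a time filter, which reflects the fact that `@` is shared between `time_filter` and `decorator` in the grammar.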
| Constraint | Value | Error Code |
| --- | --- | --- |
| Minimum length | 2 characters | INPUT_TOO_SHORT |
| Maximum length | 65,536 bytes | INPUT_TOO_LARGE |
| Maximum nesting depth | 16 levels | DEPTH_EXCEEDED |
| Empty input | Rejected | EMPTY_INPUT |
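
A sketch of how the length limits might be enforced. It assumes that "characters" means UTF-16 code units and that "bytes" means the UTF-8 encoded length; the specification states neither. `checkSize` and `utf8ByteLength` are hypothetical names, and the nesting-depth limit is left out because it applies during parsing rather than to the raw input.

```typescript
const MIN_LENGTH = 2;     // characters (assumed: UTF-16 code units)
const MAX_BYTES = 65536;  // bytes (assumed: UTF-8 encoded length)

type SizeErrorCode = "EMPTY_INPUT" | "INPUT_TOO_SHORT" | "INPUT_TOO_LARGE";

// Computes the UTF-8 length by hand so the sketch needs no runtime APIs.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) {
      bytes += 1;
    } else if (c < 0x800) {
      bytes += 2;
    } else if (c >= 0xd800 && c <= 0xdbff && i + 1 < s.length) {
      const next = s.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // valid surrogate pair: one 4-byte code point
        i++;
      } else {
        bytes += 3; // lone surrogate, counted like U+FFFD
      }
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Expects input that has already been normalized and trimmed.
function checkSize(input: string): SizeErrorCode | null {
  if (input.length === 0) return "EMPTY_INPUT";
  if (input.length < MIN_LENGTH) return "INPUT_TOO_SHORT";
  if (utf8ByteLength(input) > MAX_BYTES) return "INPUT_TOO_LARGE";
  return null;
}
```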

Before parsing, input undergoes NFC (Unicode Canonical Decomposition followed by Canonical Composition) normalization via normalizeCtxInput(). This ensures consistent handling of combining characters and diacritics.
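
To illustrate why this matters: "é" can be encoded as a single code point (U+00E9) or as "e" followed by a combining acute accent (U+0301). Without normalization the two spellings compare unequal. The sketch below uses the standard `String.prototype.normalize`; `normalizeInput` is a stand-in, since the body of `normalizeCtxInput()` is not shown on this page. The trim is included because the parsing steps list it together with normalization.

```typescript
// Stand-in for normalizeCtxInput(); the real implementation may do more.
function normalizeInput(input: string): string {
  return input.normalize("NFC").trim();
}

const composed = "?k caf\u00e9";     // "é" as one code point
const decomposed = "?k cafe\u0301";  // "e" + combining acute accent

console.log(composed === decomposed);                                 // false
console.log(normalizeInput(composed) === normalizeInput(decomposed)); // true
```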

  1. Normalize — NFC normalization + trim
  2. Size check — reject if empty, too short, or too large
  3. Extract operator — first character (position 0)
  4. Extract plane — second character (position 1)
  5. Extract verb — if character at position 2 is :, parse until whitespace
  6. Validate op×plane — check against validity matrix (unless verb present)
  7. Parse decorator — if @ followed by a known decorator name
  8. Parse target — quoted string or unquoted identifier
  9. Parse filters — consume #, @, ^, *, & prefixed tokens
  10. Parse payload — remaining text as quoted string or key-value pairs
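
Steps 3 to 5 are purely positional, so they can be sketched in a few lines. The code below assumes the input has already been normalized and size-checked. `parseHead` and `Head` are hypothetical names, plain `Error` objects stand in for the parser's own error type, and step 6 is omitted because the op×plane validity matrix is not reproduced on this page.

```typescript
const OPERATORS = ["?", "!", ">", "+", "~", "-", "^"];
const PLANES = ["k", "t", "s", "m", "a", "i", "l"];
const VERBS = [
  "clarify", "escalate", "partial", "handoff",
  "check", "edge", "cross_ref", "pattern", "scope", "apply",
  "context", "task", "exit", "example",
  "chat", "complete", "embed",
];

interface Head {
  operator: string;
  plane: string;
  verb?: string;
  rest: string; // unparsed remainder: decorator, target, filters, payload
}

function parseHead(input: string): Head {
  // Step 3: the operator is the character at position 0.
  const operator = input.charAt(0);
  if (OPERATORS.indexOf(operator) === -1) throw new Error("INVALID_OPERATOR");

  // Step 4: the plane is the character at position 1.
  const plane = input.charAt(1);
  if (PLANES.indexOf(plane) === -1) throw new Error("INVALID_PLANE");

  const head: Head = { operator, plane, rest: "" };
  let cursor = 2;

  // Step 5: a ":" at position 2 introduces a verb that runs until whitespace.
  if (input.charAt(2) === ":") {
    const match = /^\S*/.exec(input.slice(3));
    const verb = match ? match[0] : "";
    if (VERBS.indexOf(verb) === -1) throw new Error("INVALID_VERB");
    head.verb = verb;
    cursor = 3 + verb.length;
  }

  head.rest = input.slice(cursor).trim();
  return head;
}
```

For example, `parseHead("?k auth #security")` yields operator `?`, plane `k`, no verb, and `auth #security` as the remainder for the later steps.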

The parser rejects prototype-polluting keys:

__proto__ → BANNED_KEY error
constructor → BANNED_KEY error
prototype → BANNED_KEY error
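
The check matters because payload keys usually end up as property names on a plain object. A minimal sketch, assuming the key-value pairs have already been split: `buildPayload` is a hypothetical helper, a plain `Error` stands in for the parser's error type, and the use of `Object.create(null)` is an extra precaution of this sketch rather than something the specification states.

```typescript
const BANNED_KEYS = ["__proto__", "constructor", "prototype"];

// Hypothetical helper: turns parsed key-value pairs into a payload object,
// rejecting any key that could tamper with the prototype chain.
function buildPayload(pairs: Array<[string, string]>): Record<string, string> {
  // A null-prototype object has no inherited properties to overwrite.
  const payload: Record<string, string> = Object.create(null);
  for (const [key, value] of pairs) {
    if (BANNED_KEYS.indexOf(key) !== -1) {
      throw new Error("BANNED_KEY: " + key);
    }
    payload[key] = value;
  }
  return payload;
}
```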

All parse errors produce a typed CTXParseErrorData object:

| Code | Description |
| --- | --- |
| PARSE_ERROR | General parsing failure |
| EMPTY_INPUT | Input is empty after trimming |
| INPUT_TOO_SHORT | Input is less than 2 characters |
| INPUT_TOO_LARGE | Input exceeds 65,536 bytes |
| INVALID_OPERATOR | First character is not a valid operator |
| INVALID_PLANE | Second character is not a valid plane |
| INVALID_OP_PLANE | Operator not valid on this plane (e.g., >k) |
| INVALID_VERB | Unknown verb after : separator |
| VERB_NOT_ALLOWED | Verb not allowed on this operator-plane combo |
| UNTERMINATED_QUOTE | Quoted string missing closing " |
| MISSING_TARGET | Required target not provided |
| DEPTH_EXCEEDED | Nesting depth exceeds MAX_DEPTH (16) |
| BANNED_KEY | Key-value pair uses a banned property name |
| MIXED_SCRIPT | Tag value mixes Unicode scripts (homograph detection) |

interface CTXParseErrorData {
  code: CTXErrorCode;   // Machine-switchable code
  position?: number;    // Character position in input
  detail: string;       // Human-readable description
}

Errors are thrown as CTXParseError instances (extending Error) with a .data property containing the structured error.
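
A sketch of what such an error class and a call site might look like. Only the class name, the `Error` base class, and the `.data` property come from this page; the constructor signature, the message format, and the `describeError` helper are assumptions made for illustration.

```typescript
type CTXErrorCode =
  | "PARSE_ERROR" | "EMPTY_INPUT" | "INPUT_TOO_SHORT" | "INPUT_TOO_LARGE"
  | "INVALID_OPERATOR" | "INVALID_PLANE" | "INVALID_OP_PLANE"
  | "INVALID_VERB" | "VERB_NOT_ALLOWED" | "UNTERMINATED_QUOTE"
  | "MISSING_TARGET" | "DEPTH_EXCEEDED" | "BANNED_KEY" | "MIXED_SCRIPT";

interface CTXParseErrorData {
  code: CTXErrorCode;
  position?: number;
  detail: string;
}

// Assumed shape: the real class may take different constructor arguments.
class CTXParseError extends Error {
  readonly data: CTXParseErrorData;

  constructor(data: CTXParseErrorData) {
    super(data.code + ": " + data.detail);
    this.name = "CTXParseError";
    this.data = data;
    // Keeps instanceof working when compiled for older targets.
    Object.setPrototypeOf(this, CTXParseError.prototype);
  }
}

// Hypothetical call-site helper: branches on the structured data
// instead of parsing the message string.
function describeError(err: unknown): string {
  if (!(err instanceof CTXParseError)) return "unexpected error";
  const where =
    err.data.position === undefined ? "" : " at position " + err.data.position;
  return err.data.code + where + ": " + err.data.detail;
}
```

Switching on `err.data.code` rather than on the message text keeps callers stable if the human-readable `detail` wording changes.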

The parser produces a CTXStatement AST node:

interface CTXStatement {
  operator: CTXOperator;      // "?" | "!" | ">" | "+" | "~" | "-" | "^"
  plane: CTXPlane;            // "k" | "t" | "s" | "m" | "a" | "i" | "l"
  verb?: CTXVerb;             // Optional pipeline verb
  decorator?: CTXDecorator;   // Optional decorator (@similar, @traverse, etc.)
  target: string;             // The identifier or search query
  filters: CTXFilter[];       // Array of typed filters
  payload?: CTXPayload;       // Optional key-value pairs or quoted string
  raw: string;                // Original normalized input
}

type CTXFilterType = "tag" | "time" | "limit" | "project"
  | "format" | "depth" | "layer" | "kind"
  | "live" | "diff";

interface CTXFilter {
  type: CTXFilterType;
  value: string;
  mixedScript?: boolean;      // True if homograph detected
}

type CTXDecorator =
  | { type: "similar"; targetId: string; k: number }
  | { type: "relate"; source: string; edge: string; target: string }
  | { type: "traverse"; targetId: string; depth: number }
  | { type: "changes"; since: string }
  | { type: "promote"; targetId: string; layer: string }
  | { type: "grounded"; memoryId: string }
  | { type: "decay"; threshold: number };
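
Because `CTXDecorator` is a discriminated union on `type`, a consumer can switch on that field and let the compiler narrow each branch. The sketch below repeats the type so that it is self-contained; `describeDecorator` and its output strings are hypothetical and say nothing about what each decorator does at runtime.

```typescript
type CTXDecorator =
  | { type: "similar"; targetId: string; k: number }
  | { type: "relate"; source: string; edge: string; target: string }
  | { type: "traverse"; targetId: string; depth: number }
  | { type: "changes"; since: string }
  | { type: "promote"; targetId: string; layer: string }
  | { type: "grounded"; memoryId: string }
  | { type: "decay"; threshold: number };

// Hypothetical consumer: formats a decorator for logging.
function describeDecorator(d: CTXDecorator): string {
  switch (d.type) {
    case "similar":
      return "similar: " + d.targetId + ", k=" + d.k;
    case "relate":
      return "relate: " + d.source + " -" + d.edge + "-> " + d.target;
    case "traverse":
      return "traverse: " + d.targetId + ", depth=" + d.depth;
    case "changes":
      return "changes: since=" + d.since;
    case "promote":
      return "promote: " + d.targetId + ", layer=" + d.layer;
    case "grounded":
      return "grounded: " + d.memoryId;
    case "decay":
      return "decay: threshold=" + d.threshold;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new
      // variant is added to the union without a matching case.
      const unreachable: never = d;
      return unreachable;
    }
  }
}
```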

A native Rust implementation (crates/ctx-parser/) provides identical parsing behavior with multi-target compilation:

  • Native — cargo build
  • NAPI (Node.js) — napi-rs bindings
  • PyO3 (Python) — Python bindings
  • WASM (browser) — wasm-bindgen for web

The Rust parser passes the same 75 test cases as the TypeScript parser, which checks that the two implementations produce byte-for-byte identical results on that shared suite.