Formal Specification
This page defines the formal grammar, parsing rules, error handling, and AST representation of CTX.
EBNF Grammar
    statement = operator , plane , [ ":" , verb ] ,
                [ whitespace , decorator ] , [ whitespace , target ] ,
                { whitespace , filter } , [ whitespace , payload ] ;

    operator = "?" | "!" | ">" | "+" | "~" | "-" | "^" ;
    plane = "k" | "t" | "s" | "m" | "a" | "i" | "l" ;
    verb = "clarify" | "escalate" | "partial" | "handoff" | "check" | "edge"
         | "cross_ref" | "pattern" | "scope" | "apply" | "context" | "task"
         | "exit" | "example" | "chat" | "complete" | "embed" ;

    target = quoted_string | dotted_identifier | simple_identifier ;
    dotted_identifier = identifier , { "." , identifier } ;
    simple_identifier = /[^\s#@^*&"]+/ ;
    identifier = /\w+/ ;
    quoted_string = '"' , { char | escape } , '"' ;
    escape = '\"' | '\\' ;

    filter = tag_filter | time_filter | limit_filter | project_filter | format_filter | depth_filter ;
    tag_filter = "#" , /\S+/ ;
    time_filter = "@" , ( duration | iso_date ) ;
    limit_filter = "^" , /\d+/ ;
    depth_filter = "^depth=" , /\d+/ ;
    project_filter = "*" , /\S+/ ;
    format_filter = "&" , /\S+/ ;

    duration = /\d+[smhdwMy]/ ;
    iso_date = /\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}Z?)?/ ;

    payload = quoted_string | kv_pairs ;
    kv_pairs = kv_pair , { whitespace , kv_pair } ;
    kv_pair = key , "=" , ( quoted_string | /[^\s]+/ ) ;
    key = /\w+/ ;

    decorator = "@" , decorator_name , [ decorator_args ] ;
    decorator_name = "similar" | "relate" | "traverse" | "changes" | "promote" | "grounded" | "decay" ;
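To make the filter productions more concrete, here is a minimal TypeScript sketch that classifies one whitespace-delimited token using regular expressions transcribed from the grammar. The names `classifyFilterToken` and `FilterKind` are hypothetical and are not part of the CTX API; the real parser may classify tokens differently.

```typescript
// Hypothetical helper (not the CTX API): classify one token by the filter productions.
type FilterKind = "tag" | "time" | "limit" | "depth" | "project" | "format";

// Anchored versions of the duration and iso_date productions.
const DURATION = /^\d+[smhdwMy]$/;
const ISO_DATE = /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}Z?)?$/;

function classifyFilterToken(token: string): { kind: FilterKind; value: string } | null {
  const body = token.slice(1); // everything after the prefix character
  switch (token.charAt(0)) {
    case "#":
      return body.length > 0 ? { kind: "tag", value: body } : null;
    case "@":
      // Only durations and ISO dates count as time filters; "@similar" and friends are decorators.
      return DURATION.test(body) || ISO_DATE.test(body) ? { kind: "time", value: body } : null;
    case "^":
      // "^depth=N" is a depth filter, a bare "^N" is a limit filter.
      if (/^depth=\d+$/.test(body)) {
        return { kind: "depth", value: body.slice("depth=".length) };
      }
      return /^\d+$/.test(body) ? { kind: "limit", value: body } : null;
    case "*":
      return body.length > 0 ? { kind: "project", value: body } : null;
    case "&":
      return body.length > 0 ? { kind: "format", value: body } : null;
    default:
      return null; // not a filter token
  }
}
```

Under these assumptions, `classifyFilterToken("^depth=3")` yields a depth filter with value `"3"`, while `classifyFilterToken("@similar")` returns `null` because it matches neither the duration nor the ISO date pattern.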
Parsing Rules
Input Constraints
| Constraint | Value | Error Code |
|---|---|---|
| Minimum length | 2 characters | INPUT_TOO_SHORT |
| Maximum length | 65,536 bytes | INPUT_TOO_LARGE |
| Maximum nesting depth | 16 levels | DEPTH_EXCEEDED |
| Empty input | Rejected | EMPTY_INPUT |
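The constraints in this table can be checked before any tokenizing happens. The sketch below is one possible way to do that in TypeScript; the function names are hypothetical, and it assumes that "characters" means UTF-16 code units and that the byte limit applies to the UTF-8 encoding, neither of which this page states explicitly.

```typescript
// Hypothetical size check; limits are taken from the table above.
type SizeErrorCode = "EMPTY_INPUT" | "INPUT_TOO_SHORT" | "INPUT_TOO_LARGE";

const MIN_LENGTH = 2;    // characters (assumed: UTF-16 code units)
const MAX_BYTES = 65536; // bytes (assumed: UTF-8 encoded size)

// Count UTF-8 bytes without relying on platform APIs. Assumes well-formed UTF-16.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) {
      bytes += 1;
    } else if (c < 0x800) {
      bytes += 2;
    } else if (c >= 0xd800 && c <= 0xdbff && i + 1 < s.length) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Returns the error code for the first violated constraint, or null if the input passes.
function checkInputSize(input: string): SizeErrorCode | null {
  if (input.length === 0) return "EMPTY_INPUT";
  if (input.length < MIN_LENGTH) return "INPUT_TOO_SHORT";
  if (utf8ByteLength(input) > MAX_BYTES) return "INPUT_TOO_LARGE";
  return null;
}
```

The nesting depth limit is not covered here because it can only be checked while parsing.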
Normalization
Before parsing, input undergoes NFC (Unicode Canonical Decomposition followed by Canonical Composition) normalization via normalizeCtxInput(). This ensures consistent handling of combining characters and diacritics.
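The effect of NFC is easiest to see with two canonically equivalent spellings of the same text. The sketch below uses the standard `String.prototype.normalize` method; `normalizeCtxInputSketch` is a hypothetical stand-in, and the real `normalizeCtxInput()` may do more than this.

```typescript
// Sketch only: a stand-in for normalizeCtxInput(), whose actual body is not shown on this page.
function normalizeCtxInputSketch(input: string): string {
  // Apply NFC normalization, then strip leading and trailing whitespace.
  return input.normalize("NFC").trim();
}

// "e" followed by U+0301 (combining acute accent) versus the precomposed U+00E9.
const decomposed = "?k cafe\u0301";
const precomposed = "?k caf\u00e9";

// The raw strings differ, but their normalized forms are identical.
const differBeforeNfc = decomposed !== precomposed;
const sameAfterNfc =
  normalizeCtxInputSketch(decomposed) === normalizeCtxInputSketch(precomposed);
```

Without this step, the two spellings would produce different targets and tags even though they render identically.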
Parsing Sequence
- Normalize: NFC normalization + trim
- Size check: reject if empty, too short, or too large
- Extract operator: first character (position 0)
- Extract plane: second character (position 1)
- Extract verb: if the character at position 2 is `:`, parse until whitespace
- Validate op×plane: check against the validity matrix (unless a verb is present)
- Parse decorator: if `@` is followed by a known decorator name
- Parse target: quoted string or unquoted identifier
- Parse filters: consume `#`, `@`, `^`, `*`, `&` prefixed tokens
- Parse payload: remaining text as quoted string or key-value pairs
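The operator, plane, and verb steps above can be sketched as follows. This is a simplified illustration, not the actual parser: `parseHeader` and `ParsedHeader` are hypothetical names, errors are plain `Error` objects carrying the code as their message, input is assumed to be already normalized and size-checked, and the op×plane validity matrix is omitted because it is not defined on this page.

```typescript
// Hypothetical sketch of the operator, plane, and verb extraction steps.
const OPERATORS = ["?", "!", ">", "+", "~", "-", "^"];
const PLANES = ["k", "t", "s", "m", "a", "i", "l"];
const VERBS = [
  "clarify", "escalate", "partial", "handoff", "check", "edge", "cross_ref",
  "pattern", "scope", "apply", "context", "task", "exit", "example", "chat",
  "complete", "embed",
];

interface ParsedHeader {
  operator: string;
  plane: string;
  verb: string | undefined;
  rest: string; // unparsed remainder: decorator, target, filters, payload
}

function parseHeader(input: string): ParsedHeader {
  const operator = input.charAt(0); // position 0
  if (OPERATORS.indexOf(operator) === -1) throw new Error("INVALID_OPERATOR");

  const plane = input.charAt(1); // position 1
  if (PLANES.indexOf(plane) === -1) throw new Error("INVALID_PLANE");

  let pos = 2;
  let verb: string | undefined;
  if (input.charAt(pos) === ":") {
    // The verb runs from just after ":" up to the next whitespace.
    const match = /^\S*/.exec(input.slice(pos + 1));
    verb = match ? match[0] : "";
    if (VERBS.indexOf(verb) === -1) throw new Error("INVALID_VERB");
    pos += 1 + verb.length;
  }

  // The op×plane validation step would go here; it is skipped in this sketch.
  return { operator, plane, verb, rest: input.slice(pos).trim() };
}
```

For example, `parseHeader("?k auth #security")` returns operator `?`, plane `k`, no verb, and the remainder `auth #security` for the later steps to consume.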
Key-Value Pair Security
The parser rejects prototype-polluting keys:

    __proto__ → BANNED_KEY error
    constructor → BANNED_KEY error
    prototype → BANNED_KEY error
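A rough sketch of how such a check might sit inside key-value parsing is shown below. The function name `parseKvPairs`, the use of a `Map`, and the plain `Error` with the code in its message are assumptions made for illustration; the regular expression is a simplified reading of the `kv_pair` production and does not report unterminated quotes.

```typescript
// Hypothetical kv-pair parser that rejects prototype-polluting keys.
const BANNED_KEYS = ["__proto__", "constructor", "prototype"];

function parseKvPairs(text: string): Map<string, string> {
  // A Map keeps user-supplied keys away from the object prototype chain.
  const result = new Map<string, string>();
  // key = /\w+/, value = quoted string (with \" and \\ escapes) or a run of non-whitespace.
  const pairPattern = /(\w+)=("(?:[^"\\]|\\.)*"|\S+)/g;
  let m: RegExpExecArray | null;
  while ((m = pairPattern.exec(text)) !== null) {
    const key = m[1];
    const raw = m[2];
    if (BANNED_KEYS.indexOf(key) !== -1) {
      throw new Error("BANNED_KEY: " + key);
    }
    const quoted = raw.length >= 2 && raw.charAt(0) === '"' && raw.charAt(raw.length - 1) === '"';
    // Strip the surrounding quotes and resolve the two escapes the grammar defines.
    const value = quoted ? raw.slice(1, -1).replace(/\\(["\\])/g, "$1") : raw;
    result.set(key, value);
  }
  return result;
}
```

Rejecting these keys outright matters most when the parsed pairs are later copied into plain objects, where assigning to `__proto__` could alter the prototype chain.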
Error Codes
All parse errors produce a typed CTXParseErrorData object:
| Code | Description |
|---|---|
| PARSE_ERROR | General parsing failure |
| EMPTY_INPUT | Input is empty after trimming |
| INPUT_TOO_SHORT | Input is less than 2 characters |
| INPUT_TOO_LARGE | Input exceeds 65,536 bytes |
| INVALID_OPERATOR | First character is not a valid operator |
| INVALID_PLANE | Second character is not a valid plane |
| INVALID_OP_PLANE | Operator not valid on this plane (e.g., >k) |
| INVALID_VERB | Unknown verb after : separator |
| VERB_NOT_ALLOWED | Verb not allowed on this operator-plane combo |
| UNTERMINATED_QUOTE | Quoted string missing closing " |
| MISSING_TARGET | Required target not provided |
| DEPTH_EXCEEDED | Nesting depth exceeds MAX_DEPTH (16) |
| BANNED_KEY | Key-value pair uses a banned property name |
| MIXED_SCRIPT | Tag value mixes Unicode scripts (homograph detection) |
Error Structure
    interface CTXParseErrorData {
      code: CTXErrorCode;   // Machine-switchable code
      position?: number;    // Character position in input
      detail: string;       // Human-readable description
    }

Errors are thrown as CTXParseError instances (extending Error) with a .data property containing the structured error.
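As an illustration of how a caller might consume these errors, the sketch below defines a stand-in error class and switches on the code. `CTXParseErrorSketch` and `explainError` are hypothetical; only the `code`, `position`, and `detail` fields and the `.data` property come from the description above, and the real class may be constructed differently.

```typescript
// Codes copied from the Error Codes table.
type CTXErrorCode =
  | "PARSE_ERROR" | "EMPTY_INPUT" | "INPUT_TOO_SHORT" | "INPUT_TOO_LARGE"
  | "INVALID_OPERATOR" | "INVALID_PLANE" | "INVALID_OP_PLANE" | "INVALID_VERB"
  | "VERB_NOT_ALLOWED" | "UNTERMINATED_QUOTE" | "MISSING_TARGET"
  | "DEPTH_EXCEEDED" | "BANNED_KEY" | "MIXED_SCRIPT";

interface CTXParseErrorData {
  code: CTXErrorCode;
  position?: number;
  detail: string;
}

// Hypothetical stand-in for CTXParseError; the constructor signature is an assumption.
class CTXParseErrorSketch extends Error {
  readonly data: CTXParseErrorData;

  constructor(data: CTXParseErrorData) {
    super(data.code + ": " + data.detail);
    // Restore the prototype so instanceof still works when compiled to ES5.
    Object.setPrototypeOf(this, CTXParseErrorSketch.prototype);
    this.name = "CTXParseError";
    this.data = data;
  }
}

// Turn an unknown thrown value into a short message by switching on the code.
function explainError(err: unknown): string {
  if (!(err instanceof CTXParseErrorSketch)) return "Unexpected error";
  switch (err.data.code) {
    case "UNTERMINATED_QUOTE": {
      const where = err.data.position === undefined ? "unknown" : String(err.data.position);
      return "Add a closing quote (opened at position " + where + ")";
    }
    case "BANNED_KEY":
      return "Rename the key: " + err.data.detail;
    default:
      return err.data.detail;
  }
}
```

Switching on `data.code` rather than parsing the message text keeps callers stable if the human-readable `detail` wording changes.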
AST Structure
The parser produces a CTXStatement AST node:

    interface CTXStatement {
      operator: CTXOperator;      // "?" | "!" | ">" | "+" | "~" | "-" | "^"
      plane: CTXPlane;            // "k" | "t" | "s" | "m" | "a" | "i" | "l"
      verb?: CTXVerb;             // Optional pipeline verb
      decorator?: CTXDecorator;   // Optional decorator (@similar, @traverse, etc.)
      target: string;             // The identifier or search query
      filters: CTXFilter[];       // Array of typed filters
      payload?: CTXPayload;       // Optional key-value pairs or quoted string
      raw: string;                // Original normalized input
    }
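To show what such a node could look like in practice, the sketch below builds one by hand for a sample statement. The types are simplified local copies (the decorator and verb fields are left out), the payload shape is a guess because `CTXPayload` is not defined here, and the exact filter values the real parser emits (for instance whether the prefix character is kept) are assumptions.

```typescript
// Simplified local copies of the AST types; the payload shape is assumed.
type CTXOperator = "?" | "!" | ">" | "+" | "~" | "-" | "^";
type CTXPlane = "k" | "t" | "s" | "m" | "a" | "i" | "l";

interface CTXFilter {
  type: string;
  value: string;
}

interface CTXStatement {
  operator: CTXOperator;
  plane: CTXPlane;
  target: string;
  filters: CTXFilter[];
  payload?: string | { [key: string]: string };
  raw: string;
}

// One plausible AST for the input "?k auth.tokens #security ^5".
const sampleStatement: CTXStatement = {
  operator: "?",
  plane: "k",
  target: "auth.tokens",
  filters: [
    { type: "tag", value: "security" },
    { type: "limit", value: "5" },
  ],
  raw: "?k auth.tokens #security ^5",
};

// Render a compact one-line summary of a statement, e.g. for logging.
function summarize(stmt: CTXStatement): string {
  const filters = stmt.filters.map((f) => f.type + "=" + f.value).join(", ");
  return stmt.operator + stmt.plane + " target=" + stmt.target + " filters=[" + filters + "]";
}
```

Here `summarize(sampleStatement)` returns `?k target=auth.tokens filters=[tag=security, limit=5]`.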
Filter Types

    type CTXFilterType = "tag" | "time" | "limit" | "project" | "format" | "depth" | "layer" | "kind" | "live" | "diff";

    interface CTXFilter {
      type: CTXFilterType;
      value: string;
      mixedScript?: boolean;   // True if homograph detected
    }
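Because every filter carries its value as a string, consumers typically need small helpers to pull out and convert the ones they care about. The sketch below shows one possible approach; the helper names are hypothetical, and the rule that the last limit filter wins is an assumption rather than documented behavior.

```typescript
type CTXFilterType =
  | "tag" | "time" | "limit" | "project" | "format"
  | "depth" | "layer" | "kind" | "live" | "diff";

interface CTXFilter {
  type: CTXFilterType;
  value: string;
  mixedScript?: boolean;
}

// Collect the values of all filters of one type, in input order.
function filterValues(filters: CTXFilter[], type: CTXFilterType): string[] {
  return filters.filter((f) => f.type === type).map((f) => f.value);
}

// Resolve the effective limit. Assumption: when several are given, the last one wins.
function resolveLimit(filters: CTXFilter[], fallback: number): number {
  const values = filterValues(filters, "limit");
  if (values.length === 0) return fallback;
  const parsed = parseInt(values[values.length - 1], 10);
  return isNaN(parsed) ? fallback : parsed;
}

// True if any filter was flagged by homograph detection.
function hasMixedScript(filters: CTXFilter[]): boolean {
  return filters.some((f) => f.mixedScript === true);
}
```

A caller might use `hasMixedScript` to warn the user before running a query whose tag could be a look-alike of another tag.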
Decorator Types

    type CTXDecorator =
      | { type: "similar"; targetId: string; k: number }
      | { type: "relate"; source: string; edge: string; target: string }
      | { type: "traverse"; targetId: string; depth: number }
      | { type: "changes"; since: string }
      | { type: "promote"; targetId: string; layer: string }
      | { type: "grounded"; memoryId: string }
      | { type: "decay"; threshold: number };
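Since `CTXDecorator` is a discriminated union on `type`, TypeScript can narrow each variant inside a `switch`. The sketch below shows that pattern with an exhaustiveness check; `describeDecorator` is a hypothetical helper that only prints the fields and makes no claim about what each decorator does at runtime.

```typescript
type CTXDecorator =
  | { type: "similar"; targetId: string; k: number }
  | { type: "relate"; source: string; edge: string; target: string }
  | { type: "traverse"; targetId: string; depth: number }
  | { type: "changes"; since: string }
  | { type: "promote"; targetId: string; layer: string }
  | { type: "grounded"; memoryId: string }
  | { type: "decay"; threshold: number };

// Hypothetical helper: print a decorator's fields, narrowing on the "type" tag.
function describeDecorator(d: CTXDecorator): string {
  switch (d.type) {
    case "similar":
      return "similar(targetId=" + d.targetId + ", k=" + d.k + ")";
    case "relate":
      return "relate(source=" + d.source + ", edge=" + d.edge + ", target=" + d.target + ")";
    case "traverse":
      return "traverse(targetId=" + d.targetId + ", depth=" + d.depth + ")";
    case "changes":
      return "changes(since=" + d.since + ")";
    case "promote":
      return "promote(targetId=" + d.targetId + ", layer=" + d.layer + ")";
    case "grounded":
      return "grounded(memoryId=" + d.memoryId + ")";
    case "decay":
      return "decay(threshold=" + d.threshold + ")";
    default: {
      // If a new variant is added to the union, this assignment stops compiling.
      const unreachable: never = d;
      return unreachable;
    }
  }
}
```

The `never` assignment in the default branch turns a forgotten variant into a compile-time error instead of a silent fallthrough.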
Rust Parser
A native Rust implementation (crates/ctx-parser/) provides identical parsing behavior with multi-target compilation:
- Native: `cargo build`
- NAPI (Node.js): `napi-rs` bindings
- PyO3 (Python): Python bindings
- WASM (browser): `wasm-bindgen` for web
The Rust parser passes the same 75 test cases as the TypeScript parser, verifying byte-for-byte compatibility across that suite.