Formal Specification

This page defines the formal grammar, parsing rules, error handling, and AST representation of CTX.

EBNF Grammar

statement     = operator , plane , [ ":" , verb ] , [ whitespace , decorator ] ,
                [ whitespace , target ] , { whitespace , filter } ,
                [ whitespace , payload ] ;

operator      = "?" | "!" | ">" | "+" | "~" | "-" | "^" ;

plane         = "k" | "t" | "s" | "m" | "a" | "i" | "l" ;

verb          = "clarify" | "escalate" | "partial" | "handoff"
              | "check" | "edge" | "cross_ref" | "pattern" | "scope" | "apply"
              | "context" | "task" | "exit" | "example"
              | "chat" | "complete" | "embed" ;

target        = quoted_string | dotted_identifier | simple_identifier ;

dotted_identifier = identifier , { "." , identifier } ;
simple_identifier = /[^\s#@^*&"]+/ ;
identifier        = /\w+/ ;
quoted_string     = '"' , { char | escape } , '"' ;
escape            = '\"' | '\\' ;

filter        = tag_filter | time_filter | limit_filter
              | project_filter | format_filter | depth_filter ;

tag_filter     = "#" , /\S+/ ;
time_filter    = "@" , ( duration | iso_date ) ;
limit_filter   = "^" , /\d+/ ;
depth_filter   = "^depth=" , /\d+/ ;
project_filter = "*" , /\S+/ ;
format_filter  = "&" , /\S+/ ;

duration       = /\d+[smhdwMy]/ ;
iso_date       = /\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}Z?)?/ ;

payload        = quoted_string | kv_pairs ;
kv_pairs       = kv_pair , { whitespace , kv_pair } ;
kv_pair        = key , "=" , ( quoted_string | /[^\s]+/ ) ;
key            = /\w+/ ;

decorator      = "@" , decorator_name , [ decorator_args ] ;
decorator_name = "similar" | "relate" | "traverse"
               | "changes" | "promote" | "grounded" | "decay" ;
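
The quoted_string and escape productions can be read with a small hand-written scanner. The following is a minimal sketch, not the actual parser: the function name readQuotedString and its return shape are hypothetical, and it assumes a backslash followed by anything other than `"` or `\` is kept literally, which the grammar does not specify.

```typescript
// Hypothetical helper illustrating the quoted_string / escape productions.
// Returns the unescaped value and the index just past the closing quote.
function readQuotedString(
  input: string,
  start: number
): { value: string; end: number } {
  if (input.charAt(start) !== '"') {
    throw new Error("PARSE_ERROR: expected opening quote");
  }
  let value = "";
  let i = start + 1;
  while (i < input.length) {
    const ch = input.charAt(i);
    if (ch === "\\") {
      const next = input.charAt(i + 1);
      // Only \" and \\ are escapes in the grammar.
      if (next === '"' || next === "\\") {
        value += next;
        i += 2;
        continue;
      }
      // Assumption: any other backslash is an ordinary character.
    }
    if (ch === '"') {
      return { value, end: i + 1 };
    }
    value += ch;
    i += 1;
  }
  // Reached end of input without a closing quote.
  throw new Error("UNTERMINATED_QUOTE");
}

const quoted = readQuotedString('"a \\"b\\" c" rest', 0);
console.log(quoted.value); // a "b" c
```

Returning the end index lets a caller continue scanning for filters or payload from the position after the string.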

Parsing Rules

Input Constraints

Constraint	Value	Error Code
Minimum length	2 characters	`INPUT_TOO_SHORT`
Maximum length	65,536 bytes	`INPUT_TOO_LARGE`
Maximum nesting depth	16 levels	`DEPTH_EXCEEDED`
Empty input	Rejected	`EMPTY_INPUT`
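
The length limits in the table could be enforced roughly as follows. This is a sketch under assumptions: checkInputSize and the constant names are hypothetical, "characters" is taken to mean UTF-16 code units, "bytes" is taken to mean the UTF-8 encoded size, and the input is assumed to be already normalized and trimmed. The depth limit is not covered here.

```typescript
// Limits taken from the table above; names are illustrative only.
const MIN_LENGTH = 2;      // characters (assumed: UTF-16 code units)
const MAX_BYTES = 65_536;  // bytes (assumed: UTF-8 encoded size)

type SizeErrorCode = "EMPTY_INPUT" | "INPUT_TOO_SHORT" | "INPUT_TOO_LARGE";

// Returns the matching error code, or null when the input passes.
function checkInputSize(input: string): SizeErrorCode | null {
  if (input.length === 0) return "EMPTY_INPUT";
  if (input.length < MIN_LENGTH) return "INPUT_TOO_SHORT";
  const byteLength = new TextEncoder().encode(input).length;
  if (byteLength > MAX_BYTES) return "INPUT_TOO_LARGE";
  return null;
}

console.log(checkInputSize(""));   // EMPTY_INPUT
console.log(checkInputSize("?"));  // INPUT_TOO_SHORT
console.log(checkInputSize("?k")); // null
```

Checking for emptiness first means an empty string reports `EMPTY_INPUT` rather than `INPUT_TOO_SHORT`, matching the separate row in the table.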

Normalization

Before parsing, input is normalized to NFC (canonical decomposition followed by canonical composition) by normalizeCtxInput(). This ensures that the same text parses identically whether it was typed with precomposed characters or with base characters plus combining marks.
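
To illustrate why normalization matters, the sketch below compares a decomposed and a precomposed spelling of the same word. The function normalizeCtxInputSketch is a stand-in written for this example; the real normalizeCtxInput() may do more, and the trim step is assumed from the parsing sequence.

```typescript
// Stand-in for normalizeCtxInput(); illustrative only.
function normalizeCtxInputSketch(input: string): string {
  return input.normalize("NFC").trim();
}

const decomposed = "cafe\u0301"; // "e" followed by U+0301 combining acute accent
const composed = "caf\u00e9";    // precomposed U+00E9

console.log(decomposed === composed);                          // false
console.log(normalizeCtxInputSketch(decomposed) === composed); // true
```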

Parsing Sequence

1. Normalize: apply NFC normalization and trim.
2. Size check: reject if empty, too short, or too large.
3. Extract operator: first character (position 0).
4. Extract plane: second character (position 1).
5. Extract verb: if the character at position 2 is `:`, parse until whitespace.
6. Validate op×plane: check against the validity matrix (unless a verb is present).
7. Parse decorator: if `@` is followed by a known decorator name.
8. Parse target: quoted string or unquoted identifier.
9. Parse filters: consume tokens prefixed with `#`, `@`, `^`, `*`, or `&`.
10. Parse payload: remaining text as a quoted string or key-value pairs.
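
The operator, plane, and verb extraction steps of the sequence can be sketched as below. This is not the actual parser: parseHead and StatementHead are hypothetical names, errors are plain Error objects for brevity, and the sketch checks neither the verb list nor the op×plane validity matrix. The sample statement is illustrative and only needs to be well-formed at the positions being read.

```typescript
const OPERATORS: readonly string[] = ["?", "!", ">", "+", "~", "-", "^"];
const PLANES: readonly string[] = ["k", "t", "s", "m", "a", "i", "l"];

interface StatementHead {
  operator: string;
  plane: string;
  verb: string | undefined;
  rest: string; // unparsed remainder: decorator, target, filters, payload
}

// Assumes the input has already been normalized, trimmed, and size-checked.
function parseHead(input: string): StatementHead {
  const operator = input.charAt(0);
  if (OPERATORS.indexOf(operator) === -1) throw new Error("INVALID_OPERATOR");

  const plane = input.charAt(1);
  if (PLANES.indexOf(plane) === -1) throw new Error("INVALID_PLANE");

  let pos = 2;
  let verb: string | undefined;
  if (input.charAt(2) === ":") {
    // Read the verb up to the next whitespace character.
    const match = /^\S*/.exec(input.slice(3));
    verb = match ? match[0] : "";
    pos = 3 + verb.length;
  }
  return { operator, plane, verb, rest: input.slice(pos).replace(/^\s+/, "") };
}

const head = parseHead("?k:check auth.tokens #security");
console.log(head.operator, head.plane, head.verb); // ? k check
console.log(head.rest);                            // auth.tokens #security
```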

Key-Value Pair Security

The parser rejects prototype-polluting keys:

__proto__     → BANNED_KEY error
constructor   → BANNED_KEY error
prototype     → BANNED_KEY error
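
These keys matter because assigning them onto a plain JavaScript object can alter its prototype chain rather than store data. A minimal sketch of a key-value reader with the check is shown below; parseKvPairs is a hypothetical name, the regular expression is a simplification of the kv_pair production, quoted values are returned with their quotes intact, and a plain Error stands in for the real error type.

```typescript
const BANNED_KEYS: readonly string[] = ["__proto__", "constructor", "prototype"];

// Collects key=value pairs into a Map, rejecting banned keys.
function parseKvPairs(text: string): Map<string, string> {
  const pairs = new Map<string, string>();
  // key = /\w+/, value = quoted string or run of non-whitespace.
  const pattern = /(\w+)=("(?:[^"\\]|\\.)*"|\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const key = match[1] ?? "";
    const value = match[2] ?? "";
    if (BANNED_KEYS.indexOf(key) !== -1) {
      throw new Error("BANNED_KEY: " + key);
    }
    pairs.set(key, value);
  }
  return pairs;
}

const kv = parseKvPairs('priority=high owner="Ada L"');
console.log(kv.get("priority")); // high
console.log(kv.get("owner"));    // "Ada L"
```

Storing pairs in a Map rather than an object literal is a second line of defence in this sketch, since Map keys never touch the prototype chain.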

Error Codes

All parse errors produce a typed CTXParseErrorData object:

Code	Description
`PARSE_ERROR`	General parsing failure
`EMPTY_INPUT`	Input is empty after trimming
`INPUT_TOO_SHORT`	Input is shorter than 2 characters
`INPUT_TOO_LARGE`	Input exceeds 65,536 bytes
`INVALID_OPERATOR`	First character is not a valid operator
`INVALID_PLANE`	Second character is not a valid plane
`INVALID_OP_PLANE`	Operator not valid on this plane (e.g., `>k`)
`INVALID_VERB`	Unknown verb after `:` separator
`VERB_NOT_ALLOWED`	Verb not allowed on this operator-plane combo
`UNTERMINATED_QUOTE`	Quoted string missing closing `"`
`MISSING_TARGET`	Required target not provided
`DEPTH_EXCEEDED`	Nesting depth exceeds `MAX_DEPTH` (16)
`BANNED_KEY`	Key-value pair uses a banned property name
`MIXED_SCRIPT`	Tag value mixes Unicode scripts (homograph detection)

Error Structure

interface CTXParseErrorData {
  code: CTXErrorCode;      // Machine-switchable code
  position?: number;       // Character position in input
  detail: string;          // Human-readable description
}

Errors are thrown as CTXParseError instances (extending Error) with a .data property containing the structured error.
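
A caller can branch on the code in .data. The sketch below re-declares the types so it is self-contained; the class body is an assumption about how CTXParseError might be written, not its actual source, and describeParseError is a hypothetical helper.

```typescript
type CTXErrorCode =
  | "PARSE_ERROR" | "EMPTY_INPUT" | "INPUT_TOO_SHORT" | "INPUT_TOO_LARGE"
  | "INVALID_OPERATOR" | "INVALID_PLANE" | "INVALID_OP_PLANE"
  | "INVALID_VERB" | "VERB_NOT_ALLOWED" | "UNTERMINATED_QUOTE"
  | "MISSING_TARGET" | "DEPTH_EXCEEDED" | "BANNED_KEY" | "MIXED_SCRIPT";

interface CTXParseErrorData {
  code: CTXErrorCode;
  position?: number;
  detail: string;
}

// Assumed shape: an Error subclass carrying the structured data.
class CTXParseError extends Error {
  readonly data: CTXParseErrorData;
  constructor(data: CTXParseErrorData) {
    super(data.detail);
    this.name = "CTXParseError";
    this.data = data;
    // Keeps instanceof working when compiled to ES5.
    Object.setPrototypeOf(this, CTXParseError.prototype);
  }
}

// Hypothetical helper: turn any thrown value into a short message.
function describeParseError(err: unknown): string {
  if (!(err instanceof CTXParseError)) return "unexpected error";
  const { code, position, detail } = err.data;
  switch (code) {
    case "UNTERMINATED_QUOTE":
      return position === undefined
        ? "unterminated quote"
        : "unterminated quote at position " + position;
    default:
      return code + ": " + detail;
  }
}

try {
  throw new CTXParseError({
    code: "UNTERMINATED_QUOTE",
    position: 7,
    detail: 'missing closing "',
  });
} catch (err) {
  console.log(describeParseError(err)); // unterminated quote at position 7
}
```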

AST Structure

The parser produces a CTXStatement AST node:

interface CTXStatement {
  operator: CTXOperator;           // "?" | "!" | ">" | "+" | "~" | "-" | "^"
  plane: CTXPlane;                 // "k" | "t" | "s" | "m" | "a" | "i" | "l"
  verb?: CTXVerb;                  // Optional pipeline verb
  decorator?: CTXDecorator;        // Optional decorator (@similar, @traverse, etc.)
  target: string;                  // The identifier or search query
  filters: CTXFilter[];            // Array of typed filters
  payload?: CTXPayload;            // Optional key-value pairs or quoted string
  raw: string;                     // Original normalized input
}

Filter Types

type CTXFilterType = "tag" | "time" | "limit" | "project"
                   | "format" | "depth" | "layer" | "kind"
                   | "live" | "diff";

interface CTXFilter {
  type: CTXFilterType;
  value: string;
  mixedScript?: boolean;           // True if homograph detected
}
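
The prefix rules from the grammar map onto filter types roughly as sketched below. The sketch covers only the six filter kinds the grammar defines ("layer", "kind", "live", and "diff" have no production on this page), classifyFilter is a hypothetical name, and storing the value without its prefix is an assumption. The depth pattern is tested before the limit pattern because both begin with `^`.

```typescript
type GrammarFilterType = "tag" | "time" | "limit" | "project" | "format" | "depth";

interface FilterSketch {
  type: GrammarFilterType;
  value: string; // assumed: token text without its prefix
}

// Returns null when the token is not a filter under the grammar.
function classifyFilter(token: string): FilterSketch | null {
  const depth = /^\^depth=(\d+)$/.exec(token);
  if (depth) return { type: "depth", value: depth[1] ?? "" };

  const limit = /^\^(\d+)$/.exec(token);
  if (limit) return { type: "limit", value: limit[1] ?? "" };

  // duration or iso_date, per the grammar's regular expressions.
  const time =
    /^@(\d+[smhdwMy]|\d{4}-\d{2}-\d{2}(?:T\d{2}:\d{2}:\d{2}Z?)?)$/.exec(token);
  if (time) return { type: "time", value: time[1] ?? "" };

  const prefix = token.charAt(0);
  const value = token.slice(1);
  if (value.length === 0) return null;
  if (prefix === "#") return { type: "tag", value };
  if (prefix === "*") return { type: "project", value };
  if (prefix === "&") return { type: "format", value };
  return null;
}

console.log(classifyFilter("^depth=3")); // { type: 'depth', value: '3' }
console.log(classifyFilter("@7d"));      // { type: 'time', value: '7d' }
console.log(classifyFilter("@similar")); // null
```

Because `@similar` is neither a duration nor an ISO date, it falls through to null here, which leaves it available to be read as a decorator.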

Decorator Types

type CTXDecorator =
  | { type: "similar";  targetId: string; k: number }
  | { type: "relate";   source: string; edge: string; target: string }
  | { type: "traverse"; targetId: string; depth: number }
  | { type: "changes";  since: string }
  | { type: "promote";  targetId: string; layer: string }
  | { type: "grounded"; memoryId: string }
  | { type: "decay";    threshold: number };
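
Because CTXDecorator is a discriminated union on type, a switch can narrow each variant and the compiler can confirm that every case is handled. The sketch below repeats the type so it compiles on its own; formatDecorator is a hypothetical helper that only echoes the fields and makes no claim about what each decorator does.

```typescript
type CTXDecorator =
  | { type: "similar";  targetId: string; k: number }
  | { type: "relate";   source: string; edge: string; target: string }
  | { type: "traverse"; targetId: string; depth: number }
  | { type: "changes";  since: string }
  | { type: "promote";  targetId: string; layer: string }
  | { type: "grounded"; memoryId: string }
  | { type: "decay";    threshold: number };

function formatDecorator(d: CTXDecorator): string {
  switch (d.type) {
    case "similar":
      return `similar(targetId=${d.targetId}, k=${d.k})`;
    case "relate":
      return `relate(source=${d.source}, edge=${d.edge}, target=${d.target})`;
    case "traverse":
      return `traverse(targetId=${d.targetId}, depth=${d.depth})`;
    case "changes":
      return `changes(since=${d.since})`;
    case "promote":
      return `promote(targetId=${d.targetId}, layer=${d.layer})`;
    case "grounded":
      return `grounded(memoryId=${d.memoryId})`;
    case "decay":
      return `decay(threshold=${d.threshold})`;
    default: {
      // Compile-time exhaustiveness check: fails if a variant is added
      // to the union without a matching case above.
      const unreachable: never = d;
      return unreachable;
    }
  }
}

console.log(formatDecorator({ type: "similar", targetId: "m1", k: 5 }));
// similar(targetId=m1, k=5)
```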

Rust Parser

A native Rust implementation (crates/ctx-parser/) provides identical parsing behavior with multi-target compilation:

Native — cargo build
NAPI (Node.js) — napi-rs bindings
PyO3 (Python) — Python bindings
WASM (browser) — wasm-bindgen for web

The Rust parser passes the same 75 test cases as the TypeScript parser; this shared suite is how byte-for-byte compatibility between the two implementations is checked.

