Sentinel Monitoring

The Sentinel is AgentCTX’s runtime alignment enforcement system. It monitors agent behavior in real time and flags deviations from expected patterns before they cause problems.

Divergence detection tracks whether an agent’s actions align with its stated goals:

Agent says: +m:task "implement auth module" #todo
Agent does: >t filesystem.delete path="/etc/passwd"
Sentinel: ⚠️ DIVERGENCE — action contradicts stated task

The sentinel compares the agent’s declared context (stored via pipeline verbs) against its actual operations. A significant mismatch triggers an alignment alert.
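
The comparison described above can be sketched as follows. This is a minimal illustration, not the sentinel's actual logic: it assumes the declared task comes with an allow-list of tool prefixes, and the names `DeclaredIntent`, `ToolCall`, and `checkDivergence` are hypothetical.

```typescript
// Hypothetical shapes for illustration; AgentCTX's real types may differ.
interface DeclaredIntent {
  task: string;           // text the agent stored, e.g. via +m:task
  allowedTools: string[]; // assumed allow-list of tool-name prefixes for this task
}

interface ToolCall {
  tool: string; // e.g. "filesystem.delete"
  args: Record<string, string>;
}

interface DivergenceResult {
  diverged: boolean;
  reason?: string;
}

// Flags a tool call whose name matches none of the prefixes expected for the task.
function checkDivergence(intent: DeclaredIntent, call: ToolCall): DivergenceResult {
  const permitted = intent.allowedTools.some((prefix) => call.tool.startsWith(prefix));
  if (permitted) {
    return { diverged: false };
  }
  return {
    diverged: true,
    reason: `"${call.tool}" is not among the tools expected for task "${intent.task}"`,
  };
}

const intent: DeclaredIntent = {
  task: "implement auth module",
  allowedTools: ["filesystem.read", "filesystem.write", "code-analysis."],
};

// Mirrors the example above: deleting /etc/passwd does not fit the stated task.
const suspicious = checkDivergence(intent, {
  tool: "filesystem.delete",
  args: { path: "/etc/passwd" },
});

// A write inside the project is consistent with the stated task.
const consistent = checkDivergence(intent, {
  tool: "filesystem.write",
  args: { path: "src/auth.ts" },
});
```

A prefix allow-list is the simplest possible stand-in for "significant mismatch"; a real detector would presumably score the deviation against a configurable threshold.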

Groundedness enforcement validates that agent claims are backed by actual memory or knowledge:

Agent claims: "Based on our previous discussion..."
Sentinel: ?m @grounded "previous discussion" → no matching memory found
⚠️ UNGROUNDED — claim not backed by stored context

This catches confabulation — when an agent fabricates references to conversations or decisions that don’t exist in the memory plane.
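
A rough sketch of the lookup, assuming the memory plane can be treated as a list of text entries. The `MemoryEntry` shape and the `checkGrounded` function are hypothetical, and a case-insensitive substring match stands in for whatever matching the sentinel really performs.

```typescript
// Hypothetical memory entry; the real memory plane schema is not shown in this page.
interface MemoryEntry {
  id: string;
  text: string;
}

interface GroundingResult {
  grounded: boolean;
  matches: MemoryEntry[];
}

// Looks for stored entries that mention the referenced phrase.
// A case-insensitive substring match is a deliberate simplification.
function checkGrounded(reference: string, memory: MemoryEntry[]): GroundingResult {
  const needle = reference.trim().toLowerCase();
  const matches =
    needle.length === 0
      ? []
      : memory.filter((entry) => entry.text.toLowerCase().includes(needle));
  return { grounded: matches.length > 0, matches };
}

const memory: MemoryEntry[] = [
  { id: "m1", text: "task: implement auth module" },
  { id: "m2", text: "decision: use JWT for session tokens" },
];

// No entry mentions a "previous discussion", so the claim is ungrounded.
const ungrounded = checkGrounded("previous discussion", memory);

// A reference to the JWT decision is backed by entry m2.
const groundedClaim = checkGrounded("JWT", memory);
```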

Consistency detection flags when an agent’s behavior deviates from established performance baselines:

  • Performance baseline tracking — monitors agent response quality against established baselines
  • Capability regression — flags when an agent suddenly can’t do things it previously demonstrated
  • Response degradation — detects quality drops without environmental change
  • Selective compliance — catches when an agent follows some instructions but ignores others
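
One way to picture baseline tracking is a running mean of a quality score, with a regression flagged when a new score falls too far below it. The sketch below assumes such a numeric score exists; the fields of `PerformanceBaseline` shown here are guesses, and `updateBaseline` and `isRegression` are hypothetical helpers.

```typescript
// Assumed shape; the real PerformanceBaseline in types.ts may hold different fields.
interface PerformanceBaseline {
  mean: number;  // running average of observed quality scores
  count: number; // number of scores folded into the mean
}

// Folds one new score into the running mean (incremental average).
function updateBaseline(baseline: PerformanceBaseline, score: number): PerformanceBaseline {
  const count = baseline.count + 1;
  return { mean: baseline.mean + (score - baseline.mean) / count, count };
}

// Flags a score that falls more than `tolerance` below the established mean.
function isRegression(baseline: PerformanceBaseline, score: number, tolerance: number): boolean {
  return baseline.count > 0 && score < baseline.mean - tolerance;
}

let baseline: PerformanceBaseline = { mean: 0, count: 0 };
for (const score of [0.9, 0.8, 0.9, 0.8]) {
  baseline = updateBaseline(baseline, score);
}

const suddenDrop = isRegression(baseline, 0.5, 0.2);  // well below the baseline
const normalNoise = isRegression(baseline, 0.8, 0.2); // within tolerance
```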

The sentinel runs as alignment middleware in the gateway pipeline (position 9 of 15):

... → Scope Enforce → Alignment (Sentinel) → Hydration → Router → ...

It inspects the RequestContext before the request reaches the router, and can:

  • Allow — normal operation continues
  • Flag — operation proceeds with a warning annotation
  • Block — operation is rejected with an alignment error
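
The three outcomes map naturally onto a middleware wrapper. The sketch below is an assumption about shape only: `RequestContext` is reduced to two hypothetical fields, and `Decision`, `Next`, and `alignmentMiddleware` are illustrative names rather than AgentCTX's API.

```typescript
// Illustrative decision type covering the three outcomes listed above.
type Decision =
  | { kind: "allow" }
  | { kind: "flag"; warning: string }
  | { kind: "block"; error: string };

// Hypothetical, heavily reduced RequestContext.
interface RequestContext {
  operation: string;
  annotations: string[];
}

type Next = (ctx: RequestContext) => string;

// Wraps the next pipeline stage: blocks by throwing, flags by annotating, else passes through.
function alignmentMiddleware(inspect: (ctx: RequestContext) => Decision, next: Next): Next {
  return (ctx) => {
    const decision = inspect(ctx);
    if (decision.kind === "block") {
      throw new Error(`alignment error: ${decision.error}`);
    }
    const forwarded: RequestContext =
      decision.kind === "flag"
        ? { ...ctx, annotations: [...ctx.annotations, decision.warning] }
        : ctx;
    return next(forwarded);
  };
}

// Toy inspection rules, for demonstration only.
function inspect(ctx: RequestContext): Decision {
  if (ctx.operation === "filesystem.delete") {
    return { kind: "block", error: "action contradicts stated task" };
  }
  if (ctx.operation === "filesystem.write") {
    return { kind: "flag", warning: "write outside declared scope" };
  }
  return { kind: "allow" };
}

// Stand-in for the router stage that follows the sentinel.
const router: Next = (ctx) => `${ctx.operation} annotations=${ctx.annotations.length}`;
const handler = alignmentMiddleware(inspect, router);
```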

The sentinel module is organized as follows:

sentinel/
├── index.ts        # Barrel export (re-exports all detectors)
├── divergence.ts   # DivergenceDetector: action vs. stated goals
├── grounded.ts     # GroundednessEnforcer: claim validation
├── consistency.ts  # ConsistencyDetector: performance baseline tracking
└── types.ts        # Alert types, thresholds, PerformanceBaseline

The sentinel raises the following alert types:

Alert            Severity  Description
DIVERGENCE       High      Agent actions don’t match stated goals
UNGROUNDED       Medium    Claims not backed by stored context
CONSISTENCY      High      Performance deviates from established baseline
ALIGNMENT_DRIFT  Low       Gradual deviation from expected patterns
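
In code, the alert types and their severities could be represented as a small lookup. The alert names and severity levels are taken from the list above; the type names, the numeric ranking, and `sortBySeverity` are assumptions made for illustration.

```typescript
type AlertType = "DIVERGENCE" | "UNGROUNDED" | "CONSISTENCY" | "ALIGNMENT_DRIFT";
type Severity = "Low" | "Medium" | "High";

// Severity per alert type, as listed above.
const ALERT_SEVERITY: Record<AlertType, Severity> = {
  DIVERGENCE: "High",
  UNGROUNDED: "Medium",
  CONSISTENCY: "High",
  ALIGNMENT_DRIFT: "Low",
};

// Assumed numeric ranking so alerts can be ordered.
const SEVERITY_RANK: Record<Severity, number> = { Low: 0, Medium: 1, High: 2 };

// Returns a copy of the alerts ordered from most to least severe.
function sortBySeverity(alerts: AlertType[]): AlertType[] {
  return [...alerts].sort(
    (a, b) => SEVERITY_RANK[ALERT_SEVERITY[b]] - SEVERITY_RANK[ALERT_SEVERITY[a]],
  );
}

const ordered = sortBySeverity(["ALIGNMENT_DRIFT", "UNGROUNDED", "DIVERGENCE"]);
```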

Sentinel thresholds are configurable per deployment:

  • Divergence threshold: how much deviation is acceptable before flagging
  • Groundedness strictness: whether all claims must be backed, or only high-confidence ones
  • Monitoring mode: passive (log only), active (flag + annotate), or strict (block on violation)
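
As a sketch of how these settings might fit together, the example below models them as a config object and maps the monitoring mode to what happens on a violation. The field names and the `handleViolation` helper are hypothetical; this page does not show the real configuration format.

```typescript
type MonitoringMode = "passive" | "active" | "strict";
type Outcome = "log" | "flag" | "block";

// Hypothetical config shape covering the three settings above.
interface SentinelConfig {
  divergenceThreshold: number;                       // acceptable deviation before flagging
  groundednessStrictness: "all" | "high-confidence"; // which claims must be backed
  mode: MonitoringMode;
}

// What each monitoring mode does when a violation is detected.
const OUTCOME_BY_MODE: Record<MonitoringMode, Outcome> = {
  passive: "log",
  active: "flag",
  strict: "block",
};

// Returns "none" while the deviation is within the threshold, else the mode's outcome.
function handleViolation(config: SentinelConfig, deviation: number): Outcome | "none" {
  if (deviation <= config.divergenceThreshold) {
    return "none";
  }
  return OUTCOME_BY_MODE[config.mode];
}

const strictConfig: SentinelConfig = {
  divergenceThreshold: 0.3,
  groundednessStrictness: "all",
  mode: "strict",
};

const withinThreshold = handleViolation(strictConfig, 0.1);
const overThreshold = handleViolation(strictConfig, 0.7);
const passiveOutcome = handleViolation({ ...strictConfig, mode: "passive" }, 0.7);
```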

The sentinel leverages cognition traces (pipeline verbs) as ground truth for what an agent intends to do:

+m:context "reviewing PR for security" ← agent's stated intent
+m:check "checking OWASP compliance" ← declared action
>t code-analysis.scan target="auth.ts" ← actual operation ✅ matches

Because pipeline verbs are stored in the memory plane, the sentinel can build a timeline of agent intent and compare it against actual behavior.
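
A timeline like this could be assembled by parsing trace lines and attaching the previously declared intents to each action. The parser below infers the trace syntax only from the examples on this page (`+m:<verb> "<text>"` for intent, `>t <tool> <args>` for operations), so treat the grammar, `parseTraceLine`, and `buildTimeline` as assumptions.

```typescript
type TraceEvent =
  | { kind: "intent"; verb: string; text: string }
  | { kind: "action"; tool: string; args: string };

// Parses one trace line; returns null for lines that match neither assumed form.
function parseTraceLine(line: string): TraceEvent | null {
  const trimmed = line.trim();
  const intent = /^\+m:(\w+)\s+"([^"]*)"/.exec(trimmed);
  if (intent) {
    return { kind: "intent", verb: intent[1], text: intent[2] };
  }
  const action = /^>t\s+(\S+)\s*(.*)$/.exec(trimmed);
  if (action) {
    return { kind: "action", tool: action[1], args: action[2] };
  }
  return null;
}

interface TimelineEntry {
  tool: string;
  args: string;
  precedingIntents: string[]; // intents declared before this action
}

// Pairs every action with the intents the agent declared before it.
function buildTimeline(lines: string[]): TimelineEntry[] {
  const intents: string[] = [];
  const entries: TimelineEntry[] = [];
  for (const line of lines) {
    const event = parseTraceLine(line);
    if (event === null) continue;
    if (event.kind === "intent") {
      intents.push(event.text);
    } else {
      entries.push({ tool: event.tool, args: event.args, precedingIntents: [...intents] });
    }
  }
  return entries;
}

const timeline = buildTimeline([
  '+m:context "reviewing PR for security"',
  '+m:check "checking OWASP compliance"',
  '>t code-analysis.scan target="auth.ts"',
]);
```

Each entry then carries everything a detector would need to judge whether the operation matches the declared intent.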