Most AI agents are a clever model on a loose leash. BREAKER is the leash, the rails, and the map: it lets the AI do real work end-to-end, but never lets it do anything that isn't checked, signed, and reversible.
Everyone can use the same AI models; the model is the easy, commodity part. The hard, valuable part is everything wrapped around it: the rules, the memory, the permissions, the recovery. That wrapper is called the harness.
~98% of a top coding agent (Claude Code, per the teardown) is harness; only ~2% is the AI itself.
So the smartest move isn't a better model, it's a better harness. BREAKER is Fogbreak's harness, and it's where our durable advantage lives. It's built from six interlocking parts (we call them planes), and every one of them sits on top of tech we already have running: Kestra, Omnigent, the Leash, and our knowledge layers.
Grounded in: a Claude Code v2.1.88 teardown · arXiv:2605.18747 "Code as Agent Harness" · 4-model research (GLM · GPT-5.5 · Gemini · Perplexity).
The six planes
1 The Blueprint
Manifest
One document describes a whole work-area (a Shore): its steps, who's allowed to do what, which AI agents run it, and where it pauses for you. From that single blueprint, BREAKER builds both the steady assembly-line and the AI workers, so the two can never drift apart.
How it actually works
One BreakerManifest (YAML) compiles to: Kestra FlowDefinitions (the deterministic rail) + Omnigent AgentSpecs (the AI council) + Leash gate policy + bounded op-vocab + knowledge bindings.
Single source of truth: the harness is generated, not hand-wired (today we hand-author Kestra YAML and agent configs separately, and they drift).
The Manifest is the search space for self-improvement (Plane 6 can propose Manifest patches).
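As a rough illustration of "one blueprint, two outputs", the sketch below derives flow tasks and agent specs from a single dictionary. Everything in it is an assumption made for illustration: the field names, the `FlowTask` and `AgentSpec` classes, and `compile_manifest` are hypothetical and are not the real BreakerManifest schema, Kestra FlowDefinitions, or Omnigent AgentSpecs.

```python
from dataclasses import dataclass

# Hypothetical, simplified manifest for one Shore. Field names are
# illustrative assumptions, not the real BreakerManifest schema.
MANIFEST = {
    "shore": "invoicing",
    "steps": [
        {"id": "extract", "agent": "scout"},
        {"id": "review", "pause_for_human": True},
        {"id": "post", "agent": "writer"},
    ],
    "agents": {
        "scout": {"allowed_ops": ["read_file"]},
        "writer": {"allowed_ops": ["read_file", "write_ledger"]},
    },
}


@dataclass(frozen=True)
class FlowTask:
    id: str
    kind: str  # "agent" or "pause"


@dataclass(frozen=True)
class AgentSpec:
    name: str
    allowed_ops: tuple


def compile_manifest(manifest):
    """Derive both the flow tasks and the agent specs from one document."""
    agents = {
        name: AgentSpec(name, tuple(cfg["allowed_ops"]))
        for name, cfg in manifest["agents"].items()
    }
    tasks = []
    for step in manifest["steps"]:
        if step.get("pause_for_human"):
            tasks.append(FlowTask(step["id"], "pause"))
            continue
        # A step that names an undeclared agent is a compile error, so the
        # rail and the agents cannot silently drift apart.
        if step["agent"] not in agents:
            raise ValueError(
                f"step {step['id']!r} uses unknown agent {step['agent']!r}"
            )
        tasks.append(FlowTask(step["id"], "agent"))
    return tasks, agents


tasks, agents = compile_manifest(MANIFEST)
```

Because both outputs come from the same call, a patch to the manifest changes the rail and the agents together, which is what would make the Manifest a workable search space for Plane 6.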
Why it's novel
No vendor compiles one domain blueprint into both the workflow engine and the agents. (Amazon's Kiro does spec → tasks → code: close, but not both rails + agents, and not receipted.)
Two hands work together: a steady hand (the workflow engine) that does the same thing every time, and a creative hand (the AI) that figures things out. Neither is the boss. Every move passes through four safety gates.
The two co-equal layers
Kestra DAG (deterministic rail: stages, If, LoopUntil, Pause) ↔ Omnigent agent loop (AI: AIAgent, council, swap-model). The DAG isn't a wrapper around the agent; the agent isn't a task in the DAG. They're co-equal and both emit receipts.
The four boundaries
Tool: 6-phase ALLOW / ASK / DENY plus a bounded op-vocabulary. The AI may only emit ops from a fixed list; off-vocab → rejected (could_not_do). This constrains the action space, not just the output shape.
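A minimal sketch of the bounded op-vocabulary idea follows. The op names and the policy table are made up for illustration (the text does not list the real verbs), and the six phases of the actual gate are not modelled; only the "fixed list, else could_not_do" behaviour is shown.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


# Hypothetical op vocabulary and policy; the real verbs are not given in the text.
OP_POLICY = {
    "read_file": Decision.ALLOW,
    "write_file": Decision.ASK,
    "delete_repo": Decision.DENY,
}


def gate(op):
    """Return (decision, reason). Ops outside the fixed list are rejected outright."""
    if op not in OP_POLICY:
        # Off-vocabulary: the agent asked for something that does not exist
        # in its action space, regardless of how well-formed the request is.
        return Decision.DENY, "could_not_do"
    return OP_POLICY[op], "in_vocab"
```

For example, `gate("write_file")` yields an ASK decision, while `gate("format_disk")` is rejected with `could_not_do` because the verb is not in the list at all.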
Every meaningful action gets a tamper-proof, signed receipt, chained to the one before it: a notarized logbook nothing can be secretly edited out of. If the chain ever breaks, BREAKER stops. This is the "Boundary-Regulated Execution": the AI can only move forward through signed, checked steps.
How it actually works
Every state transition (not just every tool call) → a hash-chained, ECDSA-P256-signed receipt, cross-linking Kestra executionId ↔ Omnigent conv_id.
Receipt-gated: downstream consumers verify the chain segment before reading new state; a broken chain fail-closes the harness.
Receipts are typed (task-execution vs evolution:*), so an auditor can answer "why did BREAKER change?"
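The chaining and fail-closed check can be sketched with the standard library alone. Note the substitution: this sketch signs with HMAC-SHA256 as a stand-in, not ECDSA-P256 as the text describes, purely so that it runs without third-party packages. The receipt fields, the `"genesis"` marker, and the function names are assumptions and do not reflect exec-plane/leash/verify.py.

```python
import hashlib
import hmac
import json

SECRET = b"demo-key"  # assumption: stand-in for a real signing key


def _digest(body):
    # Canonical JSON so the same body always hashes to the same value.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def _sign(digest):
    return hmac.new(SECRET, digest.encode(), hashlib.sha256).hexdigest()


def make_receipt(prev_hash, kind, payload):
    """Create a receipt whose hash covers the previous receipt's hash."""
    body = {"prev": prev_hash, "kind": kind, "payload": payload}
    digest = _digest(body)
    return {**body, "hash": digest, "sig": _sign(digest)}


def verify_chain(receipts):
    """Return True only if every link, hash, and signature checks out."""
    prev = "genesis"
    for r in receipts:
        body = {"prev": r["prev"], "kind": r["kind"], "payload": r["payload"]}
        digest = _digest(body)
        if r["prev"] != prev or r["hash"] != digest:
            return False
        if not hmac.compare_digest(r["sig"], _sign(digest)):
            return False
        prev = r["hash"]
    return True


r1 = make_receipt("genesis", "task-execution",
                  {"executionId": "ex-1", "conv_id": "c-1"})
r2 = make_receipt(r1["hash"], "evolution:prompt", {"change": "reword"})
chain = [r1, r2]
```

A consumer in this sketch would call `verify_chain(chain)` before reading new state and refuse to proceed on `False`. Editing any field of `r1` after the fact changes its recomputed hash, so the check fails at that receipt.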
Status
Done: the verifier is already proven (exec-plane/leash/verify.py, CI-gated, tamper fixtures fail-closed). To build: live runtime emission from Kestra/Omnigent, the single highest-leverage unblock.
Why it's novel
Durable-execution rivals (Temporal/DBOS/Restate) have logs, but mutable ones. None chain + sign across the engine-agent boundary. This is BREAKER's sharpest edge.
4 The Navigator · "AKER"
Knowledge Router
Instead of one dumb search, BREAKER picks the right way to find each answer, and remembers which routes worked (well-worn paths glow brighter). The AI's memory lives outside its short-term window and is pulled in only when needed.
grep / FastContext: exact symbol · recent file
BGE-M3 + Qdrant: fuzzy / meaning
GraphRAG: "how does X relate to Y"
Graphiti: "what changed / when"
OKF files: canonical definitions
stigmergy: well-worn paths brighten
Two kinds of routing
Knowledge-engine routing: a deterministic classifier sends each query to the engine that wins for that query shape (table above).
Harness routing: per task, route to the best agent backend (Claude SDK / Codex / Gemini / OpenHands…) on cost · risk · fit · prior success. We can be vendor-neutral; no vendor will route to its competitors.
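To make "deterministic classifier" concrete, here is a toy version of knowledge-engine routing. The regex rules are invented for illustration and are far cruder than anything production-grade; only the engine names come from the table. Stigmergic reinforcement and harness routing are not modelled.

```python
import re


def route_query(query):
    """Pick a knowledge engine from the query's shape (toy rules, first match wins)."""
    q = query.lower()
    if re.search(r"\bwhat changed\b|\bwhen\b", q):
        return "Graphiti"  # temporal questions
    if re.search(r"\brelate", q):
        return "GraphRAG"  # relationship questions
    if re.search(r"\bdefin", q):
        return "OKF files"  # canonical definitions
    if re.search(r"[a-z0-9]+_[a-z0-9_]+|\w+\(\)|\.py\b", q):
        return "grep / FastContext"  # looks like a symbol or file name
    return "BGE-M3 + Qdrant"  # default: fuzzy / semantic search
```

The same query always lands on the same engine, which is what would let each routing decision be logged and later compared against eval results.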
Why it matters
Research finding (arXiv:2605.15184, "Is Grep All You Need?"): the harness, not the retriever, decides accuracy, so the routing decision is the thing to own, govern, and make learnable. The router itself is receipted + reinforced by what evals well.
5 One Pen · Many Eyes
Convergence
Many AI agents can read and argue, but only one is allowed to actually write, so they can't trip over each other. And when two agents genuinely disagree, BREAKER doesn't just pick one: it runs both as a real experiment and learns which was right.
How it actually works
Council agents share one Loro CRDT state object + Leash correlation IDs so they converge instead of diverging. Pattern: scout → plan → verify → ONE writer → merge gate (matches Cognition's "writes stay single-threaded" lesson).
Divergence = fuel (your directive): a detector flags material disagreement → forks parallel receipted A/B branches → a deterministic oracle (tests/eval, never the AI judging itself) picks the winner → winner merges, loser is kept as a learning, and the result reinforces routing.
6 It Gets Better · Safely
Evolution
BREAKER improves itself over time (better prompts, routing, skills), but every change is tested against a fixed yardstick, signed, and reversible. It can learn, but it can't secretly rewrite itself.
The self-improvement loop
propose → oracle-evaluate → canary → promote / roll-back. Every change is a falsifiable prediction + a typed evolution:* receipt. The optimizer can never edit its own oracle.
Three change tiers (the safety gradient)
Auto after eval: prompt wording, route weights, memory weights.
Canary + auto-rollback: agent configs, skill patches, DAG logic.
Human-approved: new tools, new spend, new op-vocab verbs, oracle changes.
Start at harness level (safe); gate code self-modification behind sandbox + holdout evals + human review.
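The three tiers and the promote / roll-back decision could be encoded roughly as below. The change-kind labels are shorthand derived from the lists above, and the `decide` function, its score arguments, and its return strings are assumptions for illustration, not an existing BREAKER API.

```python
TIER_AUTO = "auto_after_eval"
TIER_CANARY = "canary_auto_rollback"
TIER_HUMAN = "human_approved"

# Labels are shorthand for the change kinds listed in the three tiers.
CHANGE_TIERS = {
    "prompt_wording": TIER_AUTO,
    "route_weights": TIER_AUTO,
    "memory_weights": TIER_AUTO,
    "agent_config": TIER_CANARY,
    "skill_patch": TIER_CANARY,
    "dag_logic": TIER_CANARY,
    "new_tool": TIER_HUMAN,
    "new_spend": TIER_HUMAN,
    "new_op_vocab_verb": TIER_HUMAN,
    "oracle_change": TIER_HUMAN,
}


def decide(change_kind, baseline_score, candidate_score, human_approved=False):
    """Return the next action for a proposed change after oracle evaluation."""
    # Unknown change kinds fall into the strictest tier.
    tier = CHANGE_TIERS.get(change_kind, TIER_HUMAN)
    # The prediction "this change helps" is falsified if the score does not improve.
    if candidate_score <= baseline_score:
        return "roll-back"
    if tier == TIER_AUTO:
        return "promote"
    if tier == TIER_CANARY:
        return "canary"
    return "promote" if human_approved else "await-human"
```

Defaulting unlisted change kinds to the human-approved tier keeps the gradient fail-safe: a new kind of change has to be explicitly classified before it can be automated. Oracle changes sit in that same tier, so the optimizer in this sketch has no path to promote a change to its own yardstick unaided.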
Where BREAKER sits in the real world
There's already a category of tools that give AI "durable rails": they survive crashes and replay reliably. Our engine Kestra is one of them. BREAKER is the member of that category that is also cryptographically governed, knowledge-native, and self-improving.
Fogbreak · the BREAKER harness: Boundary-Regulated Execution & Agent Knowledge Engine Routing
Phase-0 grounded research (2026-06-23). Co-designed with GLM-5.2 · GPT-5.5 · Gemini · Perplexity, off the Anthropic meter. A harness that learns, but cannot secretly mutate.