Most AI agents are a clever model on a loose leash. BREAKER is the leash, the rails, and the map — it lets the AI do real work end-to-end, but never lets it do anything that isn't checked, signed, and reversible.
deterministic rails + AI judgement · ✅ all 6 planes BUILT · 101 tests · merged
First, the big idea: the harness is the moat
Everyone can use the same AI models — the model is the easy, commodity part. The hard, valuable part is everything wrapped around it: the rules, the memory, the permissions, the recovery. That wrapper is called the harness.
~98% of a top coding agent (per a Claude Code teardown) is harness; only ~2% is the AI itself
So the smartest move isn't a better model — it's a better harness. BREAKER is Fogbreak's harness, and it's where our durable advantage lives. It's built from six interlocking parts (we call them planes) — and every one of them sits on top of tech we already have running: Kestra, Omnigent, the Leash, and our knowledge layers.
Grounded in: a Claude Code v2.1.88 teardown · arXiv:2605.18747 "Code as Agent Harness" · 4-model research (GLM · GPT-5.5 · Gemini · Perplexity).
The six planes
1. The Blueprint
📐 Manifest
One document describes a whole work-area (a Shore): its steps, who's allowed to do what, which AI agents run it, and where it pauses for you. From that single blueprint, BREAKER builds both the steady assembly-line and the AI workers — so the two can never drift apart.
How it actually works
One BreakerManifest (YAML) compiles to: Kestra FlowDefinitions (the deterministic rail) + Omnigent AgentSpecs (the AI council) + Leash gate policy + bounded op-vocab + knowledge bindings.
Single source of truth → the harness is generated, not hand-wired (today we hand-author Kestra YAML and agent configs separately — they drift).
The Manifest is the search space for self-improvement (Plane 6 can propose Manifest patches).
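The "one manifest, many artifacts" idea can be sketched as below. This is a minimal illustration, not the real compiler: the `BreakerManifest` fields, the `CompiledHarness` shape, and the stage and agent names are all hypothetical, since the actual schema is not shown here. It only demonstrates validating first and then deriving every artifact from the same input, so the outputs cannot drift from one another.

```typescript
// Hypothetical, simplified shapes; the real BreakerManifest schema is not shown in this document.
interface Stage {
  id: string;
  kind: "task" | "pause"; // "pause" marks a human gate
  agent?: string;         // agent that runs this stage, if any
}

interface BreakerManifest {
  shore: string;                            // the work-area being described
  stages: Stage[];
  agents: { name: string; role: string }[];
  opVocab: string[];                        // bounded list of ops agents may emit
}

interface CompiledHarness {
  flow: { id: string; tasks: { id: string; type: string }[] };        // stand-in for a Kestra flow
  agentSpecs: { name: string; role: string; allowedOps: string[] }[]; // stand-in for agent specs
  gatePolicy: { pauseAt: string[] };                                  // stand-in for gate policy
}

function compileManifest(m: BreakerManifest): CompiledHarness {
  const declared = new Set(m.agents.map((a) => a.name));

  // Validate everything up front so a bad manifest yields no partial output.
  for (const s of m.stages) {
    if (s.agent !== undefined && !declared.has(s.agent)) {
      throw new Error(`stage "${s.id}" references undeclared agent "${s.agent}"`);
    }
  }

  // Every artifact is derived from the same manifest, in one pass.
  return {
    flow: {
      id: `${m.shore}-flow`,
      tasks: m.stages.map((s) => ({ id: s.id, type: s.kind === "pause" ? "Pause" : "Task" })),
    },
    agentSpecs: m.agents.map((a) => ({ name: a.name, role: a.role, allowedOps: [...m.opVocab] })),
    gatePolicy: { pauseAt: m.stages.filter((s) => s.kind === "pause").map((s) => s.id) },
  };
}

// Usage with an illustrative real-estate Shore.
const manifest: BreakerManifest = {
  shore: "real-estate",
  stages: [
    { id: "draft-listing", kind: "task", agent: "writer" },
    { id: "owner-approval", kind: "pause" },
    { id: "publish", kind: "task", agent: "writer" },
  ],
  agents: [{ name: "writer", role: "drafts and publishes listings" }],
  opVocab: ["listing.read", "listing.update"],
};

const compiled = compileManifest(manifest);
console.log(compiled.gatePolicy.pauseAt);
```

Because the function is pure, compiling the same manifest twice gives identical output, which is one simple way to get the idempotence the compiler is described as having.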
Why it's novel
No vendor compiles one domain blueprint into both the workflow engine and the agents. (Amazon's Kiro does spec→tasks→code — close, but not both rails + agents, and not receipted.)
Status — ✅ now BUILT
The BreakerManifest compiler is BUILT (services/breaker/src/compiler/): one typed manifest → Kestra flows + Omnigent AgentSpecs + gate policy + op-vocab + knowledge bindings; all-or-nothing, idempotent, cross-artifact consistent, carries no model id/secret. Proven on the real-estate Shore.
Two hands work together: a steady hand (the workflow engine) that does the same thing every time, and a creative hand (the AI) that figures things out. Neither is the boss. Every move passes through four safety gates.
The two co-equal layers
Kestra DAG (deterministic rail: stages, If, LoopUntil, Pause) ⊕ Omnigent agent loop (AI: AIAgent, council, swap-model). The DAG isn't a wrapper around the agent; the agent isn't a task in the DAG — they're co-equal and both emit receipts.
The four boundaries
Tool — 6-phase ALLOW / ASK / DENY + a bounded op-vocabulary: the AI may only emit ops from a fixed list; off-vocab → rejected (could_not_do). Constrains the action space, not just output shape.
Flow — Kestra Pause = human gate that can wait days, resume from any device.
Spend — every model call routes through Ferry (BYOK + metering); the spend event itself becomes a receipt.
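The Tool boundary's bounded op-vocabulary can be illustrated with a small gate. This sketch assumes a hypothetical vocabulary that maps each op name to one verdict, plus a made-up `gateOp` function. It does not model the six phases mentioned above. It only shows that anything outside the fixed list is rejected as `could_not_do` before any policy is consulted.

```typescript
type Verdict = "ALLOW" | "ASK" | "DENY";

interface ProposedOp {
  op: string;
  args: Record<string, unknown>;
}

type GateResult =
  | { status: "proceed"; op: ProposedOp }
  | { status: "needs_human"; op: ProposedOp }
  | { status: "denied"; reason: string }
  | { status: "could_not_do"; reason: string };

// Hypothetical vocabulary for illustration; op names are invented.
const opVocab: ReadonlyMap<string, Verdict> = new Map<string, Verdict>([
  ["listing.read", "ALLOW"],
  ["listing.update_price", "ASK"],
  ["listing.delete", "DENY"],
]);

function gateOp(proposed: ProposedOp, vocab: ReadonlyMap<string, Verdict>): GateResult {
  const verdict = vocab.get(proposed.op);

  // Off-vocab ops never reach policy evaluation: the action space itself is bounded.
  if (verdict === undefined) {
    return { status: "could_not_do", reason: `off-vocab op: ${proposed.op}` };
  }

  switch (verdict) {
    case "ALLOW":
      return { status: "proceed", op: proposed };
    case "ASK":
      return { status: "needs_human", op: proposed };
    case "DENY":
      return { status: "denied", reason: `policy denies ${proposed.op}` };
  }
}

// Usage: an op the model invented is rejected, a known one is routed by verdict.
const offVocab = gateOp({ op: "shell.exec", args: { cmd: "rm -rf /" } }, opVocab);
const needsHuman = gateOp({ op: "listing.update_price", args: { price: 1200 } }, opVocab);
console.log(offVocab.status, needsHuman.status);
```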
Status — ✅ now BUILT
The governed-turn runtime is BUILT (services/breaker/src/execution/, PR #130): runGovernedTurn drives one turn with the Kestra DAG ⊕ Omnigent loop co-equal, emitting one signed Leash receipt per consequential transition into the live D1LeashStore — the live-emission gap is closed — and verifying prior state with gatedRead (fail-closed). All four boundaries are enforced at runtime; agents only propose via a typed Work Envelope, only BREAKER commits; wired to the real OmnigentSSE / KestraRestFlowSource seams (no mocks).
Every meaningful action gets a tamper-evident, signed receipt, chained to the one before it: a notarized logbook that nothing can be secretly edited out of. If the chain ever breaks, BREAKER stops. This is the "Boundary-Regulated Execution" in BREAKER's name: the AI can only move forward through signed, checked steps.
How it actually works
Every state transition (not just every tool call) → a hash-chained, ECDSA-P256-signed receipt, cross-linking Kestra executionId ⊕ Omnigent conv_id.
Receipt-gated: downstream consumers verify the chain segment before reading new state; a broken chain fail-closes the harness.
Receipts are typed — task-execution vs evolution:* — so an auditor can answer "why did BREAKER change?"
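The chaining and fail-closed read described above can be sketched with Node's built-in `node:crypto` module. This is an illustration under stated assumptions, not the Leash implementation: the `Receipt` fields, the JSON-array canonicalization, the all-zero genesis hash, and the signature of `gatedRead` are all hypothetical choices made for the example.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical receipt shape; the real Leash receipt format is not shown in this document.
interface Receipt {
  seq: number;
  type: string;        // e.g. "task-execution" or an "evolution:..." type
  executionId: string; // workflow-engine side of the cross-link
  convId: string;      // agent side of the cross-link
  prevHash: string;    // hash of the previous receipt
  payload: string;
  hash: string;        // SHA-256 over the fields above
  signature: string;   // base64 ECDSA P-256 signature over the hash
}

type ReceiptBody = Omit<Receipt, "hash" | "signature">;
const GENESIS = "0".repeat(64); // assumed starting value for prevHash

function bodyHash(r: ReceiptBody): string {
  // A fixed field order stands in for a real canonical encoding.
  const canonical = JSON.stringify([r.seq, r.type, r.executionId, r.convId, r.prevHash, r.payload]);
  return createHash("sha256").update(canonical).digest("hex");
}

function appendReceipt(
  chain: Receipt[],
  fields: Pick<Receipt, "type" | "executionId" | "convId" | "payload">,
  privateKey: KeyObject,
): Receipt[] {
  const prev = chain[chain.length - 1];
  const body: ReceiptBody = { seq: chain.length, ...fields, prevHash: prev ? prev.hash : GENESIS };
  const hash = bodyHash(body);
  const signature = sign("sha256", Buffer.from(hash), privateKey).toString("base64");
  return [...chain, { ...body, hash, signature }];
}

function verifyChain(chain: Receipt[], publicKey: KeyObject): boolean {
  let expectedPrev = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const r = chain[i];
    if (r.seq !== i || r.prevHash !== expectedPrev) return false; // link broken or reordered
    if (bodyHash(r) !== r.hash) return false;                     // content edited
    const sig = Buffer.from(r.signature, "base64");
    if (!verify("sha256", Buffer.from(r.hash), publicKey, sig)) return false; // bad signature
    expectedPrev = r.hash;
  }
  return true;
}

// Fail closed: state is only read after the chain verifies.
function gatedRead<T>(chain: Receipt[], publicKey: KeyObject, read: () => T): T {
  if (!verifyChain(chain, publicKey)) throw new Error("receipt chain broken: failing closed");
  return read();
}

// Usage: two transitions, then a gated read of downstream state.
const { publicKey, privateKey } = generateKeyPairSync("ec", { namedCurve: "prime256v1" });
let chain: Receipt[] = [];
chain = appendReceipt(chain, { type: "task-execution", executionId: "exec-1", convId: "conv-1", payload: "draft: done" }, privateKey);
chain = appendReceipt(chain, { type: "evolution:route-weights", executionId: "exec-1", convId: "conv-1", payload: "grep +0.1" }, privateKey);
console.log(gatedRead(chain, publicKey, () => "latest state"));
```

Editing any earlier payload changes its hash, which no longer matches the stored hash or the next receipt's `prevHash`, so verification fails from that point on.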
Status — ✅ now BUILT
The verifier was already proven (exec-plane/leash/verify.py, CI-gated). The runtime emission and the fail-closed gate are now BUILT (services/breaker/src/leash/): 5 transition types emit signed receipts, gatedRead halts fail-closed on any break, and a TS port of the verifier is conformance-proven equivalent to the Python verifier in both directions (Python accepts a TS-authored chain; TS reproduces Python's verdicts byte-for-byte). The single highest-leverage unblock is closed.
Why it's novel
Durable-execution rivals (Temporal/DBOS/Restate) have logs — but mutable ones. None chain + sign across the engine⊕agent boundary. This is BREAKER's sharpest edge.
4. The Navigator · "AKER"
🧭 Knowledge Router
Instead of one dumb search, BREAKER picks the right way to find each answer — and remembers which routes worked (well-worn paths glow brighter). The AI's memory lives outside its short-term window and is pulled in only when needed.
grep / FastContext: exact symbol · recent file
BGE-M3 + Qdrant: fuzzy / meaning
GraphRAG: "how does X relate to Y"
Graphiti: "what changed / when"
OKF files: canonical definitions
stigmergy: well-worn paths brighten
Two kinds of routing
Knowledge-engine routing — a deterministic classifier sends each query to the engine that wins for that query shape (table above).
Harness routing — per task, route to the best agent backend (Claude SDK / Codex / Gemini / OpenHands…) on cost · risk · fit · prior success. We can be vendor-neutral; no vendor will route to its competitors.
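Knowledge-engine routing by query shape can be sketched as an ordered rule list where the first match wins. The regular expressions, the `routeQuery` name, and the choice of the vector engine as the fallback are assumptions made for this example. The real classifier's rules are not shown here.

```typescript
type Engine = "grep" | "vector" | "graphrag" | "graphiti" | "okf";

interface RouteDecision {
  engine: Engine;
  reason: string; // kept so the decision could be recorded alongside the query
}

// Ordered rules: the first match wins, so the same query always takes the same route.
// The patterns are illustrative guesses at "query shape", not the real classifier.
const RULES: { engine: Engine; matches: RegExp; reason: string }[] = [
  { engine: "graphiti", matches: /\b(what changed|when did|when was)\b/i, reason: "temporal question" },
  { engine: "graphrag", matches: /\b(relates? to|depends? on)\b/i, reason: "relationship question" },
  { engine: "okf", matches: /^(define|definition of)\b/i, reason: "canonical definition" },
  { engine: "grep", matches: /[a-z][A-Z]|\w\(\)|\w\.(ts|py|yaml)\b/, reason: "exact symbol or file" },
];

function routeQuery(query: string): RouteDecision {
  for (const rule of RULES) {
    if (rule.matches.test(query)) {
      return { engine: rule.engine, reason: rule.reason };
    }
  }
  // Assumed safe fallback: semantic search when no shape is recognized.
  return { engine: "vector", reason: "fallback: fuzzy / meaning" };
}

// Usage
const sampleQueries = [
  "where is runGovernedTurn defined",
  "how does the Leash relate to Kestra",
  "what changed in the manifest last week",
  "define Shore",
  "tenants asking about noise",
];
for (const q of sampleQueries) {
  console.log(q, "->", routeQuery(q).engine);
}
```

Keeping the rules as data rather than branching code means a change to routing is a small, reviewable diff, which suits a router whose decisions are meant to be recorded and tuned.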
Why it matters
Research finding (arXiv:2605.15184, "Is Grep All You Need?"): the harness, not the retriever, decides accuracy — so the routing decision is the thing to own, govern, and make learnable. The router itself is receipted + reinforced by what evals well.
Status — ✅ v0 BUILT
Knowledge Router v0 is BUILT (services/breaker/src/knowledge-router/): deterministic per-query engine routing (≥95% on a labeled set) + vendor-neutral per-task harness routing, every decision receipted, context as a just-in-time projection, stigmergy-reinforced, safe fallback.
5. One Pen · Many Eyes
⚖️ Convergence
Many AI agents can read and argue, but only one is allowed to actually write — so they can't trip over each other. And when two agents genuinely disagree, BREAKER doesn't just pick one: it runs both as a real experiment and learns which was right.
How it actually works
Council agents share one Loro CRDT state object + Leash correlation IDs so they converge instead of diverging. Pattern: scout → plan → verify → ONE writer → merge gate (matches Cognition's "writes stay single-threaded" lesson).
Divergence = fuel (your directive): a detector flags material disagreement → forks parallel receipted A/B branches → a deterministic oracle (tests/eval, never the AI judging itself) picks the winner → winner merges, loser is kept as a learning, and the result reinforces routing.
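The divergence-to-A/B step can be sketched as follows. This is a toy under stated assumptions: the `Arm` shape, the `runAB` function, the rent-conversion task, and the fixed test cases are all invented for illustration, and this `detectDivergence` is a simplistic stand-in that just compares outputs on the oracle's inputs. The point it shows is that plain code, not a model, decides the winner, and that a tie keeps both arms instead of silently picking one.

```typescript
interface Arm {
  id: string;
  backend: string;                           // which agent backend produced this proposal
  candidate: (annualRent: number) => number; // the proposed behaviour under test
}

type Outcome =
  | { kind: "winner"; winner: Arm; loser: Arm; scores: [number, number] }
  | { kind: "no_winner"; kept: [Arm, Arm]; scores: [number, number] };

// Fixed test cases: [annual rent, expected whole-unit monthly rent]. Invented for the example.
const CASES: [number, number][] = [
  [12000, 1000],
  [10000, 833],
  [100, 8],
];

// Deterministic oracle: counts passing cases. No model is asked to judge.
function oracleScore(arm: Arm): number {
  return CASES.filter(([input, expected]) => arm.candidate(input) === expected).length;
}

// Simplistic stand-in: the arms diverge if they disagree on any oracle input.
function detectDivergence(a: Arm, b: Arm): boolean {
  return CASES.some(([input]) => a.candidate(input) !== b.candidate(input));
}

function runAB(a: Arm, b: Arm, routeWeights: Map<string, number>): Outcome | null {
  if (!detectDivergence(a, b)) return null; // nothing to resolve

  const scores: [number, number] = [oracleScore(a), oracleScore(b)];
  if (scores[0] === scores[1]) {
    return { kind: "no_winner", kept: [a, b], scores }; // never a silent pick
  }

  const [winner, loser] = scores[0] > scores[1] ? [a, b] : [b, a];
  // Reinforce the route that produced the winning arm.
  routeWeights.set(winner.backend, (routeWeights.get(winner.backend) ?? 0) + 1);
  return { kind: "winner", winner, loser, scores };
}

// Usage: two agents disagree on whether to round.
const armA: Arm = { id: "A", backend: "backend-a", candidate: (x) => x / 12 };
const armB: Arm = { id: "B", backend: "backend-b", candidate: (x) => Math.round(x / 12) };
const weights = new Map<string, number>();
const outcome = runAB(armA, armB, weights);
console.log(outcome?.kind, weights.get("backend-b"));
```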
Status — ✅ now BUILT
The convergence runtime is BUILT (services/breaker/src/convergence/, PR #129): a shared Loro CRDT council state with one deterministic writer behind an invariant merge gate (the CRDT's only mutation path), every apply sealed by a council_apply receipt; it implements the CouncilStateApply port that Plane 2 calls. The convergence protocol (scout→plan→verify→one-writer→gate→receipt) and the divergence→A/B runtime (detectDivergence → receipted A/B arms → a deterministic oracle decides → winner merged via the writer + route reinforced; no-winner → both kept, never a silent pick) are built. 25 convergence tests.
6. It Gets Better · Safely
🌱 Evolution
BREAKER improves itself over time — better prompts, routing, skills — but every change is tested against a fixed yardstick, signed, and reversible. It can learn, but it can't secretly rewrite itself.
The self-improvement loop
propose → oracle-evaluate → canary → promote / roll-back. Every change is a falsifiable prediction + a typed evolution:* receipt. The optimizer can never edit its own oracle.
Three change tiers (the safety gradient)
Auto after eval: prompt wording, route weights, memory weights.
Canary + auto-rollback: agent configs, skill patches, DAG logic.
Human-approved: new tools, new spend, new op-vocab verbs, oracle changes.
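The three tiers and the "optimizer can never edit its own oracle" rule can be expressed as a small decision function. The `ChangeKind` names, the `ProposedChange` fields, and the rule that every tier must first pass the eval are assumptions for this sketch. The real promotion policy is not shown here.

```typescript
type Tier = "auto_after_eval" | "canary_auto_rollback" | "human_approved";

type ChangeKind =
  | "prompt_wording" | "route_weights" | "memory_weights"
  | "agent_config" | "skill_patch" | "dag_logic"
  | "new_tool" | "new_spend" | "new_op_verb" | "oracle_change";

// The safety gradient from the list above, as a lookup table.
const TIER_OF: Record<ChangeKind, Tier> = {
  prompt_wording: "auto_after_eval",
  route_weights: "auto_after_eval",
  memory_weights: "auto_after_eval",
  agent_config: "canary_auto_rollback",
  skill_patch: "canary_auto_rollback",
  dag_logic: "canary_auto_rollback",
  new_tool: "human_approved",
  new_spend: "human_approved",
  new_op_verb: "human_approved",
  oracle_change: "human_approved",
};

// Hypothetical change record; field names are illustrative.
interface ProposedChange {
  kind: ChangeKind;
  proposedBy: "optimizer" | "human";
  evalPassed: boolean;     // result of the fixed oracle evaluation
  canaryHealthy?: boolean; // undefined until a canary has run
  humanApproved?: boolean; // undefined until a human has decided
}

type Decision = "promote" | "start_canary" | "roll_back" | "await_human" | "reject";

function decide(c: ProposedChange): Decision {
  // Structural bar: the optimizer may not touch the yardstick it is measured by.
  if (c.kind === "oracle_change" && c.proposedBy === "optimizer") return "reject";
  // Assumption: every tier must first pass the oracle evaluation.
  if (!c.evalPassed) return "reject";

  switch (TIER_OF[c.kind]) {
    case "auto_after_eval":
      return "promote";
    case "canary_auto_rollback":
      if (c.canaryHealthy === undefined) return "start_canary";
      return c.canaryHealthy ? "promote" : "roll_back";
    case "human_approved":
      if (c.humanApproved === undefined) return "await_human";
      return c.humanApproved ? "promote" : "reject";
  }
}

// Usage
console.log(decide({ kind: "route_weights", proposedBy: "optimizer", evalPassed: true }));
console.log(decide({ kind: "dag_logic", proposedBy: "optimizer", evalPassed: true, canaryHealthy: false }));
console.log(decide({ kind: "oracle_change", proposedBy: "optimizer", evalPassed: true, humanApproved: true }));
```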
Evolution Plane Phase-1 is BUILT (services/breaker/src/evolution/): council disagreement → receipted A/B arms → a deterministic oracle picks the winner (never the AI judging itself) → routing reinforced, loser kept as a learning. The optimizer is structurally barred from editing the oracle/verifier/promotion policy (proven by test). No code changes yet — prompt/route A/B only, so it's provably safe.
Start at harness level (safe); gate code self-modification behind sandbox + holdout evals + human review.
Where BREAKER sits in the real world
There's already a category of tools that give AI "durable rails" — they survive crashes and replay reliably. Our engine Kestra is one of them. BREAKER is the member of that category that is also cryptographically governed, knowledge-native, and self-improving.
The foundation is in. The next surface makes it visible and editable: a canvas in the Forge where you drag agents and wire their fan-outs — a supervisor delegating to workers, a swarm handing off, an evaluator looping on a judge. Pick a premade Council blueprint, then rearrange it freely.
The trick: the picture is the Manifest
The tree you draw isn't a diagram of the system — it is a BreakerManifest. Arrange it visually → the compiler (Plane 1, already built) turns it into real Kestra flows + Omnigent agents. Zero new orchestration — Kestra already ships the agent loop, agent-as-tool nesting, A2A, and judged loops; the canvas just edits the blueprint they run from.
Because the Leash is now live, the canvas lights up from the real run — per-node badges (pending → running → done / failed), animated delegation edges, cost + token per node — streamed from genuine Kestra execution states + Omnigent runs + signed Leash receipts. Not a mock. The cozy World and this schematic tree become two views of one true feed.
The market gap (research, 2026-06-21): no product unifies a named-pattern gallery + free-form editing + a live receipt-backed run-overlay + framework-native export for agent trees. Fogbreak can — because the export target (the Manifest) and the receipts (the Leash) already exist. Grounded in docs/design/2026-06-21-visual-council-trees.md + off-meter swarm-trees research.
🌫 Fogbreak · the BREAKER harness — Boundary-Regulated Execution & Agent Knowledge Engine Routing
Phase-0 grounded research (2026-06-23) → foundation BUILT (PR #114, 4 planes) → all 6 planes BUILT + MERGED (2026-06-24): Plane 2 Execution (PR #130) + Plane 5 Convergence (PR #129) · 101 tests · GLM-authored off-meter, Claude-orchestrated. Next surface: the Forge visual Council builder.
A harness that learns but cannot secretly mutate.