BREAKER

1 The Blueprint

📐 Manifest

One document describes a whole work-area (a Shore): its steps, who's allowed to do what, which AI agents run it, and where it pauses for you. From that single blueprint, BREAKER builds both the steady assembly-line and the AI workers — so the two can never drift apart.

How it actually works

One BreakerManifest (YAML) compiles to: Kestra FlowDefinitions (the deterministic rail) + Omnigent AgentSpecs (the AI council) + Leash gate policy + bounded op-vocab + knowledge bindings.
Single source of truth → the harness is generated, not hand-wired (today we hand-author Kestra YAML and agent configs separately — they drift).
The Manifest is the search space for self-improvement (Plane 6 can propose Manifest patches).
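
As a rough illustration of the "one blueprint, many artifacts" idea in the list above, the TypeScript sketch below compiles a toy manifest into a flow, agent specs, and an op-vocabulary in a single all-or-nothing step. Every type and function name in it (ToyManifest, compileManifest, and so on) is hypothetical and is not taken from the BREAKER codebase; the real BreakerManifest schema is not shown in this document.

```typescript
// Hypothetical, heavily simplified manifest shape (an assumption, not the real schema).
interface ToyStep { id: string; agent?: string; pause?: boolean }
interface ToyManifest {
  shore: string;
  opVocab: string[];
  agents: { name: string; allowedOps: string[] }[];
  steps: ToyStep[];
}

type TaskType = "agent" | "pause" | "task";

interface CompiledArtifacts {
  flow: { id: string; tasks: { id: string; type: TaskType }[] };
  agentSpecs: { name: string; allowedOps: string[] }[];
  opVocab: string[];
}

type CompileResult =
  | { ok: true; artifacts: CompiledArtifacts }
  | { ok: false; errors: string[] };

function taskTypeOf(step: ToyStep): TaskType {
  if (step.pause === true) return "pause";
  return step.agent !== undefined ? "agent" : "task";
}

function compileManifest(m: ToyManifest): CompileResult {
  const errors: string[] = [];
  const vocab = new Set(m.opVocab);
  const agentNames = new Set(m.agents.map((a) => a.name));

  // Cross-artifact consistency: an agent may only be granted ops that are in the vocabulary.
  for (const agent of m.agents) {
    for (const op of agent.allowedOps) {
      if (!vocab.has(op)) errors.push(`agent ${agent.name} uses off-vocab op ${op}`);
    }
  }
  // Every step that names an agent must name a declared one.
  for (const step of m.steps) {
    if (step.agent !== undefined && !agentNames.has(step.agent)) {
      errors.push(`step ${step.id} references unknown agent ${step.agent}`);
    }
  }
  // All-or-nothing: any error means no artifact at all is emitted.
  if (errors.length > 0) return { ok: false, errors };

  // Sorting keeps the output deterministic, so compiling twice gives identical artifacts.
  return {
    ok: true,
    artifacts: {
      flow: {
        id: `${m.shore}-flow`,
        tasks: m.steps.map((s) => ({ id: s.id, type: taskTypeOf(s) })),
      },
      agentSpecs: m.agents.map((a) => ({ name: a.name, allowedOps: [...a.allowedOps].sort() })),
      opVocab: [...m.opVocab].sort(),
    },
  };
}
```

Because the flow and the agent specs are derived from the same validated input, a mismatch between them shows up as a compile error rather than as drift at runtime.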

Why it's novel

No vendor compiles one domain blueprint into both the workflow engine and the agents. (Amazon's Kiro does spec→tasks→code — close, but not both rails + agents, and not receipted.)

Status — ✅ now BUILT

The BreakerManifest compiler is BUILT (services/breaker/src/compiler/): one typed manifest → Kestra flows + Omnigent AgentSpecs + gate policy + op-vocab + knowledge bindings; all-or-nothing, idempotent, cross-artifact consistent, carries no model id/secret. Proven on the real-estate Shore.

Generalizes our existing per-Shore ShoreManifest + op-vocab.ts.

2 The Two Hands

🤝 Execution

Two hands work together: a steady hand (the workflow engine) that does the same thing every time, and a creative hand (the AI) that figures things out. Neither is the boss. Every move passes through four safety gates.

The two co-equal layers

Kestra DAG (deterministic rail: stages, If, LoopUntil, Pause) ⊕ Omnigent agent loop (AI: AIAgent, council, swap-model). The DAG isn't a wrapper around the agent; the agent isn't a task in the DAG — they're co-equal and both emit receipts.

The four boundaries

Tool — 6-phase ALLOW / ASK / DENY + a bounded op-vocabulary: the AI may only emit ops from a fixed list; off-vocab → rejected (could_not_do). Constrains the action space, not just output shape.
Sandbox — bwrap / seatbelt + egress MITM (Omnigent).
Flow — Kestra Pause = human gate that can wait days, resume from any device.
Spend — every model call routes through Ferry (BYOK + metering); the spend event itself becomes a receipt.
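
The Tool boundary above can be pictured as a check that runs before any proposed op is considered. The sketch below is illustrative only: the op names, the ProposedOp shape, and the gateOp function are assumptions, and the 6-phase ALLOW / ASK / DENY policy is collapsed into a single lookup.

```typescript
type Decision = "ALLOW" | "ASK" | "DENY";
type Outcome = Decision | "could_not_do";

// Hypothetical fixed vocabulary with a per-op policy (assumed names, not BREAKER's real list).
const OP_POLICY = new Map<string, Decision>([
  ["read_listing", "ALLOW"],
  ["send_email", "ASK"],
  ["delete_record", "DENY"],
]);

interface ProposedOp { op: string; args: Record<string, unknown> }

function gateOp(proposed: ProposedOp): { outcome: Outcome; reason: string } {
  const decision = OP_POLICY.get(proposed.op);
  // Off-vocab ops are rejected outright: the action space itself is bounded,
  // regardless of how well-formed the proposal is.
  if (decision === undefined) {
    return { outcome: "could_not_do", reason: `off-vocab op: ${proposed.op}` };
  }
  return { outcome: decision, reason: `policy for ${proposed.op}` };
}
```

An agent proposing `{ op: "read_listing", args: {} }` would be allowed, while an op outside the list never reaches the policy phase at all.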

Status — ✅ now BUILT

The governed-turn runtime is BUILT (services/breaker/src/execution/, PR #130). runGovernedTurn drives one turn with the Kestra DAG ⊕ Omnigent loop as co-equals. It emits one signed Leash receipt per consequential transition into the live D1LeashStore, which closes the live-emission gap, and it verifies prior state with gatedRead (fail-closed). All four boundaries are enforced at runtime: agents only propose via a typed Work Envelope, and only BREAKER commits. The runtime is wired to the real OmnigentSSE / KestraRestFlowSource seams (no mocks).

Precedent: Temporal's "deterministic workflow / LLM-in-activity" split (used by OpenAI, Replit, Cursor).

3 The Receipt Chain · "BRE"

🔗 Leash

Every meaningful action gets a signed, tamper-evident receipt chained to the one before it, like a notarized logbook from which nothing can be secretly edited out. If the chain ever breaks, BREAKER stops. This is "Boundary-Regulated Execution": the AI can only move forward through signed, checked steps.

How it actually works

Every state transition (not just every tool call) → a hash-chained, ECDSA-P256-signed receipt, cross-linking Kestra executionId ⊕ Omnigent conv_id.
Receipt-gated: downstream consumers verify the chain segment before reading new state; a broken chain fail-closes the harness.
Receipts are typed — task-execution vs evolution:* — so an auditor can answer "why did BREAKER change?"
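
The chaining and fail-closed behaviour described above can be sketched in a few lines. This is a minimal illustration under loud assumptions: a toy 32-bit FNV-1a hash stands in for a cryptographic hash, the ECDSA-P256 signature step is left out entirely, and the Receipt shape and function bodies are hypothetical (only the name gatedRead appears in this document, and its real implementation is not shown).

```typescript
type ReceiptKind = "task-execution" | `evolution:${string}`;

interface Receipt {
  seq: number;
  kind: ReceiptKind;
  payload: string;
  prevHash: string; // hash of the previous receipt, or GENESIS for the first one
  hash: string;
}

const GENESIS = "00000000";

// Toy FNV-1a hash. NOT cryptographic: a real chain would use SHA-256 plus a signature.
function toyHash(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

function digest(seq: number, kind: string, payload: string, prevHash: string): string {
  return toyHash(JSON.stringify([seq, kind, payload, prevHash]));
}

// Returns a new chain with one more receipt linked to the current tail.
function appendReceipt(chain: Receipt[], kind: ReceiptKind, payload: string): Receipt[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const seq = chain.length;
  return [...chain, { seq, kind, payload, prevHash, hash: digest(seq, kind, payload, prevHash) }];
}

// Checks sequence numbers, back-links, and each receipt's own hash.
function verifyChain(chain: Receipt[]): boolean {
  let prev = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const r = chain[i];
    if (!r || r.seq !== i || r.prevHash !== prev) return false;
    if (r.hash !== digest(r.seq, r.kind, r.payload, r.prevHash)) return false;
    prev = r.hash;
  }
  return true;
}

// Fail-closed read: state is only handed out if the chain verifies.
function gatedRead<T>(chain: Receipt[], read: () => T): T {
  if (!verifyChain(chain)) throw new Error("receipt chain broken: refusing to read state");
  return read();
}
```

Dropping or reordering a receipt breaks the sequence and back-link checks, so a consumer calling gatedRead on such a chain gets an error instead of state.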

Status — ✅ now BUILT

The verifier was already proven (exec-plane/leash/verify.py, CI-gated). The runtime emission and the fail-closed gate are now BUILT as well (services/breaker/src/leash/): 5 transition types emit signed receipts, gatedRead halts fail-closed on any break, and a TS verifier port is conformance-proven equivalent to the Python verifier in both directions (Python accepts a TS-authored chain; TS reproduces Python's verdicts byte-for-byte). That closes the single highest-leverage unblock.

Why it's novel

Durable-execution rivals (Temporal/DBOS/Restate) have logs — but mutable ones. None chain + sign across the engine⊕agent boundary. This is BREAKER's sharpest edge.

4 The Navigator · "AKER"

🧭 Knowledge Router

Instead of one dumb search, BREAKER picks the right way to find each answer — and remembers which routes worked (well-worn paths glow brighter). The AI's memory lives outside its short-term window and is pulled in only when needed.

grep / FastContext: exact symbol · recent file
BGE-M3 + Qdrant: fuzzy / meaning
GraphRAG: "how does X relate to Y"
Graphiti: "what changed / when"
OKF files: canonical definitions
stigmergy: well-worn paths brighten

Two kinds of routing

Knowledge-engine routing — a deterministic classifier sends each query to the engine that wins for that query shape (table above).
Harness routing — per task, route to the best agent backend (Claude SDK / Codex / Gemini / OpenHands…) on cost · risk · fit · prior success. We can be vendor-neutral; no vendor will route to its competitors.
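
Both kinds of routing above reduce to small deterministic functions. The sketch below is a toy under stated assumptions: the regular expressions are invented heuristics for "query shape", the engine identifiers are shorthand for the engines named earlier, and the unweighted score in pickBackend is an arbitrary stand-in for whatever cost · risk · fit · prior-success formula the real router uses.

```typescript
// Shorthand for: grep / FastContext, BGE-M3 + Qdrant, GraphRAG, Graphiti, OKF files.
type Engine = "grep" | "vector" | "graphrag" | "graphiti" | "okf";

// Deterministic classifier: the same query always routes to the same engine.
function classifyQuery(query: string): Engine {
  const text = query.toLowerCase();
  if (/\b(what changed|when did|since|history of)\b/.test(text)) return "graphiti";
  if (/\b(relate|relates|related to|depend|depends on|connect)\b/.test(text)) return "graphrag";
  if (/\b(define|definition of|canonical)\b/.test(text)) return "okf";
  // Symbol-like tokens: a call such as foo(, a camelCase name, or a file name with an extension.
  if (/[A-Za-z_]+\(|[a-z]+[A-Z][A-Za-z]*|\w+\.\w{1,4}\b/.test(query)) return "grep";
  return "vector"; // fallback: fuzzy / meaning search
}

// Hypothetical per-task backend profile; every field is assumed to lie in [0, 1].
interface Backend { name: string; cost: number; risk: number; fit: number; priorSuccess: number }

// Picks the backend with the highest score; returns undefined for an empty list.
function pickBackend(candidates: Backend[]): Backend | undefined {
  let best: Backend | undefined;
  let bestScore = -Infinity;
  for (const c of candidates) {
    const score = c.fit + c.priorSuccess - c.cost - c.risk;
    if (score > bestScore) {
      best = c;
      bestScore = score;
    }
  }
  return best;
}
```

Keeping both functions pure makes each routing decision easy to record alongside its inputs, which is what a receipted router needs.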

Why it matters

Research finding (arXiv:2605.15184, "Is Grep All You Need?"): the harness, not the retriever, decides accuracy — so the routing decision is the thing to own, govern, and make learnable. The router itself is receipted + reinforced by what evals well.

Status — ✅ v0 BUILT

Knowledge Router v0 is BUILT (services/breaker/src/knowledge-router/): deterministic per-query engine routing (≥95% on a labeled set) + vendor-neutral per-task harness routing, every decision receipted, context as a just-in-time projection, stigmergy-reinforced, safe fallback.

5 One Pen · Many Eyes

⚖️ Convergence

Many AI agents can read and argue, but only one is allowed to actually write — so they can't trip over each other. And when two agents genuinely disagree, BREAKER doesn't just pick one: it runs both as a real experiment and learns which was right.

How it actually works

Council agents share one Loro CRDT state object + Leash correlation IDs so they converge instead of diverging. Pattern: scout → plan → verify → ONE writer → merge gate (matches Cognition's "writes stay single-threaded" lesson).
Divergence = fuel (your directive): a detector flags material disagreement → forks parallel receipted A/B branches → a deterministic oracle (tests/eval, never the AI judging itself) picks the winner → winner merges, loser is kept as a learning, and the result reinforces routing.
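
The divergence path above (detect, fork two arms, let a deterministic oracle decide, keep the loser) can be sketched as follows. The Arm shape, the text-normalisation test for "material disagreement", and the resolveDivergence function are assumptions made for illustration; the oracle is passed in as a plain scoring function so that no agent judges its own output.

```typescript
interface Arm { id: string; proposal: string }

interface Resolution {
  merged: Arm | null;     // the arm handed to the single writer, or null if there is no winner
  keptAsLearning: Arm[];  // arms retained for later learning
}

// Toy detector: proposals count as diverging if they differ after whitespace/case normalisation.
function detectDivergence(a: Arm, b: Arm): boolean {
  const norm = (s: string): string => s.trim().toLowerCase().replace(/\s+/g, " ");
  return norm(a.proposal) !== norm(b.proposal);
}

// oracle must be deterministic (for example, a count of passing tests for that arm).
function resolveDivergence(a: Arm, b: Arm, oracle: (arm: Arm) => number): Resolution {
  if (!detectDivergence(a, b)) return { merged: a, keptAsLearning: [] };

  const scoreA = oracle(a);
  const scoreB = oracle(b);

  // No winner: keep both arms and merge nothing, rather than silently picking one.
  if (scoreA === scoreB) return { merged: null, keptAsLearning: [a, b] };

  const aWins = scoreA > scoreB;
  return { merged: aWins ? a : b, keptAsLearning: [aWins ? b : a] };
}
```

In a fuller version each branch would also emit a receipt and feed the outcome back into routing; those steps are omitted here to keep the decision logic visible.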

Status — ✅ now BUILT

The convergence runtime is BUILT (services/breaker/src/convergence/, PR #129). Council state lives in a shared Loro CRDT with one deterministic writer behind an invariant merge gate, which is the CRDT's only mutation path, and every apply is sealed by a council_apply receipt. It implements the CouncilStateApply port that Plane 2 calls. Both the convergence protocol (scout → plan → verify → one writer → gate → receipt) and the divergence → A/B runtime are built: detectDivergence → receipted A/B arms → a deterministic oracle decides → the winner is merged via the writer and its route reinforced; with no winner, both arms are kept and there is never a silent pick. Covered by 25 convergence tests.

6 It Gets Better · Safely

🌱 Evolution

BREAKER improves itself over time — better prompts, routing, skills — but every change is tested against a fixed yardstick, signed, and reversible. It can learn, but it can't secretly rewrite itself.

The self-improvement loop

propose → oracle-evaluate → canary → promote / roll-back. Every change is a falsifiable prediction + a typed evolution:* receipt. The optimizer can never edit its own oracle.
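
One way to read the loop above is as a small state machine in which each stage leaves a receipt and the oracle is an input the loop cannot modify. The sketch below assumes hypothetical names throughout (Change, EvolutionReceipt, runEvolutionLoop), and it adds a "rejected" stage for changes that fail the oracle before any canary, which is an assumption rather than something stated in the source.

```typescript
type Stage = "proposed" | "evaluated" | "rejected" | "canary" | "promoted" | "rolled_back";

interface Change { id: string; predictedGain: number } // the falsifiable prediction
interface EvolutionReceipt { kind: string; changeId: string; stage: Stage }

function runEvolutionLoop(
  change: Change,
  oracleGain: (c: Change) => number,     // fixed yardstick, supplied from outside the loop
  canaryHealthy: (c: Change) => boolean, // result of a limited rollout
): EvolutionReceipt[] {
  const receipts: EvolutionReceipt[] = [];
  const emit = (stage: Stage): void => {
    receipts.push({ kind: "evolution:" + stage, changeId: change.id, stage });
  };

  emit("proposed");
  const measured = oracleGain(change);
  emit("evaluated");

  // The prediction is falsified if the measured gain falls short of the predicted one.
  if (measured < change.predictedGain) {
    emit("rejected");
    return receipts;
  }

  emit("canary");
  emit(canaryHealthy(change) ? "promoted" : "rolled_back");
  return receipts;
}
```

Because oracleGain is a parameter rather than something the loop constructs, nothing inside the loop has a path to edit it, which mirrors the "optimizer can never edit its own oracle" rule in miniature.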

Three change tiers (the safety gradient)

Auto after eval: prompt wording, route weights, memory weights.
Canary + auto-rollback: agent configs, skill patches, DAG logic.
Human-approved: new tools, new spend, new op-vocab verbs, oracle changes.
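
The three tiers above amount to a lookup from change kind to required gate. The sketch below encodes that table directly; the identifier spellings are invented for the example, and sending unknown change kinds to the human-approved tier is an assumption chosen to be the safest default.

```typescript
type Tier = "auto_after_eval" | "canary_auto_rollback" | "human_approved";

type ChangeKind =
  | "prompt_wording" | "route_weights" | "memory_weights"
  | "agent_config" | "skill_patch" | "dag_logic"
  | "new_tool" | "new_spend" | "new_op_vocab_verb" | "oracle_change";

// One entry per change kind named in the tier list.
const TIER_OF: Record<ChangeKind, Tier> = {
  prompt_wording: "auto_after_eval",
  route_weights: "auto_after_eval",
  memory_weights: "auto_after_eval",
  agent_config: "canary_auto_rollback",
  skill_patch: "canary_auto_rollback",
  dag_logic: "canary_auto_rollback",
  new_tool: "human_approved",
  new_spend: "human_approved",
  new_op_vocab_verb: "human_approved",
  oracle_change: "human_approved",
};

function isChangeKind(kind: string): kind is ChangeKind {
  return Object.prototype.hasOwnProperty.call(TIER_OF, kind);
}

// Unknown kinds fall through to the strictest tier (assumed fail-safe default).
function tierFor(kind: string): Tier {
  return isChangeKind(kind) ? TIER_OF[kind] : "human_approved";
}
```

Typing the table as Record<ChangeKind, Tier> means that adding a new change kind without assigning it a tier is a compile-time error.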

Grounded in the 2026 SOTA

Darwin-Gödel-Machine, ADAS, AlphaEvolve (evolutionary, gated) · Reflexion / Self-Refine · DSPy/MIPRO (prompt opt) · Voyager (skill libraries) · A-MEM (memory evolution) · Agentic Harness Engineering (the observability skeleton). Each maps onto a Fogbreak primitive (Leash / stigmergy / Manifest / GraphRAG / op-vocab).

Status — ✅ Phase-1 BUILT

Evolution Plane Phase-1 is BUILT (services/breaker/src/evolution/): council disagreement → receipted A/B arms → a deterministic oracle picks the winner (never the AI judging itself) → routing reinforced, loser kept as a learning. The optimizer is structurally barred from editing the oracle/verifier/promotion policy (proven by test). No code changes yet — prompt/route A/B only, so it's provably safe.

Start at harness level (safe); gate code self-modification behind sandbox + holdout evals + human review.

First, the big idea: the harness is the moat

Where BREAKER sits in the real world

The category: durable-execution-for-agents

BREAKER adds, on top:

Built on what we already run

Next: build your Council by hand, in the Forge

The trick: the picture is the Manifest

Watch it run — for real