Engineering

March 8, 2026 · 6 min read

The Spine Is Live

We stopped talking about provable governance and started running it. A field report from the day the full chain closed.

By AmplefAI

The day started with drift.

Nexus — our control plane agent — couldn't find his own workspace. Paths in the bootstrap were wrong. A core identity file pointed to a directory that had moved weeks earlier. Parts of the fleet inventory were stale. The system still functioned, but the operating environment around it had started to rot.

This is what happens when you let an autonomous environment accumulate too much memory, too many responsibilities, and too little cleanup. It keeps going for longer than it should. And then, one day, you realise you're no longer building on top of a clean foundation.

So I stopped building and started cleaning.

By the end of the day, the full governed execution spine was proven end-to-end: real implementations, real cryptographic signing, real replay. Not a simulation. Not a mock. A governed action that passed through every layer of the stack and came out the other side with proof.

Here's how we got there.


What We Mean by "The Spine"

The governed execution spine is a chain of cryptographically linked steps. Each step produces a verifiable artifact. Each step fails closed — meaning if it fails, nothing downstream executes. There is no ungoverned fallback.

Capture Context (PCK) → Hash (SHA-256) → Persist (Store) → Sign Token (GEI Ed25519) → Execute (Adapter) → Replay (Verify)

Fail-closed at every step. No ungoverned execution fallback.
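The fail-closed property can be sketched as a pipeline where every stage either returns its artifact or throws, aborting everything downstream. This is a hypothetical TypeScript sketch, not the real API; `runChain` and the toy stages are illustrative names.

```typescript
// Hypothetical sketch of a fail-closed pipeline: each stage either
// returns its artifact or throws, and a throw aborts every later stage.
// All names here are illustrative, not AmplefAI's real API.

type Stage<I, O> = (input: I) => O;

function runChain<T>(input: T, stages: Stage<any, any>[]): unknown {
  // Deliberately no try/catch: any stage failure propagates and
  // halts the chain. There is no ungoverned fallback path.
  return stages.reduce((acc, stage) => stage(acc), input as unknown);
}

// Toy stages standing in for capture → hash → persist.
const capture = (ctx: string) => ({ snapshot: ctx });
const hash = (s: { snapshot: string }) => ({ ...s, hash: `sha256:${s.snapshot.length}` });
const persist = (s: { snapshot: string; hash: string }) => {
  if (!s.hash) throw new Error("persist failed: no hash"); // fail closed
  return s;
};

const result = runChain("agent context", [capture, hash, persist]) as {
  snapshot: string;
  hash: string;
};
```

A real pipeline would carry cryptographic hashes and signed tokens between stages; the structural point is only that no stage runs unless every stage before it produced its artifact.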

What each step proves: if you can replay the chain, you can answer three questions.

What did the agent know?
What was it authorised to do?
What did it actually do?

With proof, not inference.


What Actually Happened

Once the boot files were fixed and Nexus could orient cleanly again, I needed a coding surface that wouldn't pollute the control plane.

Nexus had been doing too much directly: implementation work, repo management, code changes, test runs. Over time, that created the same problem we are trying to solve architecturally — blurred boundaries between control, execution, and memory.

So we created Anvil. Not an agent in the fleet. A dev surface: Claude Code running repo-locally with a sharply scoped context file defining exactly what it can touch and what it must escalate.

Anvil writes code and runs tests. Nexus keeps architectural coherence. That separation matters.

Anvil's first task was the PCK write path: making the context kernel actually produce snapshot data that the rest of the spine could consume.

The Import Decision

The first real architectural question was simple: how does the kernel talk to the store?

They live in different packages. There's no monorepo workspace wiring that makes that boundary disappear. The obvious answer — a deep relative import — would have worked today and become permanent sludge tomorrow.

So we used dependency injection instead.

The kernel produces snapshot data. It does not know the store exists. The orchestrator layer wires them together through injected ports. The caller provides the real implementations.
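That wiring can be sketched as follows. The port and function names here (`HashFn`, `SnapshotStore`, `makeOrchestrator`, `govern`) are assumptions for illustration, not the actual package interfaces.

```typescript
// Sketch of the injected-ports pattern described above: the kernel
// produces snapshot data and never imports the store; the orchestrator
// receives both capabilities as function-typed ports. Names are
// illustrative, not the real package API.

interface Snapshot { agentId: string; data: string }

// Ports: capabilities the orchestrator receives, not modules it imports.
type HashFn = (payload: string) => string;
type SnapshotStore = { save: (hash: string, snap: Snapshot) => void };

function makeOrchestrator(hashFn: HashFn, store: SnapshotStore) {
  return {
    govern(snap: Snapshot): string {
      const digest = hashFn(snap.data);
      store.save(digest, snap); // persistence via the injected port
      return digest;
    },
  };
}

// The caller wires in real implementations at the composition root.
const saved = new Map<string, Snapshot>();
const orch = makeOrchestrator(
  (p) => `h:${p.length}`, // stand-in for a real SHA-256 implementation
  { save: (h, s) => void saved.set(h, s) },
);
const digest = orch.govern({ agentId: "nexus", data: "context" });
```

Swapping the stand-in hash and in-memory map for real implementations changes nothing about the orchestrator, which is the point of the pattern.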

Layer Separation

kernel.ts (PCK, context producer): produces snapshot data
src/index.ts (orchestrator, injected ports): hashes, persists, signs
packages/gei-core (governance / execution): token signing, replay
adapter: executes with signed token
replay: reconstructs the proof chain

Each layer has one job. No layer reaches into another.

That matters more than it sounds. If the kernel imports the store directly, you collapse the boundary between what the agent knew and where that knowledge is recorded. Those are different questions with different trust properties. Keeping them separate is what makes replay meaningful.

The Persistence Path

We wired the spine in two stages.

First came the persistence path: snapshot capture, hashing, and store persistence inside the orchestrator's govern() flow. The rule was simple: if any step fails, execution aborts. No fallback. No "continue ungoverned."

Anvil delivered this cleanly: injected port types for the hash function and the store, with the orchestrator receiving capabilities rather than importing implementation details.
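The abort rule can be sketched like this. All names are illustrative, assuming a synchronous flow for brevity; the real orchestrator's interfaces may differ.

```typescript
// Minimal sketch of the fail-closed rule: a failure in capture, hashing,
// or persistence aborts govern() before signing or execution is reached.
// All names are illustrative, not the real orchestrator API.

type Ports = {
  capture: () => string;
  hash: (snapshot: string) => string;
  persist: (hash: string, snapshot: string) => void; // throws on failure
  signAndExecute: (hash: string) => void;
};

function govern(ports: Ports): void {
  const snapshot = ports.capture();
  const digest = ports.hash(snapshot);
  ports.persist(digest, snapshot); // a throw here stops everything below
  ports.signAndExecute(digest);    // only runs with a persisted snapshot
}

// Demonstration: a failing store means nothing executes.
let executed = false;
try {
  govern({
    capture: () => "ctx",
    hash: (s) => `h:${s}`,
    persist: () => { throw new Error("store unavailable"); },
    signAndExecute: () => { executed = true; },
  });
} catch {
  // expected: the chain failed closed
}
```

There is no catch-and-continue inside `govern()` itself; the caller sees the failure, and nothing downstream ran.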

The Signing Path

Next came token signing.

The snapshot hash from the persistence path flows into the GEI token payload. That token is signed with Ed25519 and accompanies the execution request.
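Using Node's built-in crypto, the binding looks roughly like this. The payload shape is an assumption for illustration, not the actual GEI token format.

```typescript
// Sketch of binding a snapshot hash into a signed token with Ed25519,
// via Node's built-in crypto module. The token fields here are an
// assumed shape, not the real GEI token format.
import { generateKeyPairSync, sign, verify } from "node:crypto";

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const token = {
  snapshot_hash: "sha256:<digest>", // placeholder for the persisted hash
  action: "repo.write",             // hypothetical action identifier
  issued_at: Date.now(),
};

const payload = Buffer.from(JSON.stringify(token));
// For Ed25519 keys, Node's sign/verify take null as the algorithm.
const signature = sign(null, payload, privateKey);
const valid = verify(null, payload, publicKey, signature);
```

Because the snapshot hash is inside the signed payload, any later claim about what the agent knew is checkable against the signature, not just against logs.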

One design question surfaced here: where does workflow_id belong?

The easy answer was to put it on config as a static kernel value. The better answer was to make it request-scoped. Once the system runs mixed workloads — and it will — a static workflow ID becomes a shortcut that hardens into a constraint.

So we made it explicit per call, with a fallback derived from action type when needed.
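The resolution rule is small enough to sketch directly; `resolveWorkflowId` and the request shape are hypothetical names for illustration.

```typescript
// Sketch of the request-scoped workflow_id decision: explicit per call,
// with a fallback derived from the action type. Names are illustrative.

interface GovernRequest {
  actionType: string;
  workflowId?: string; // request-scoped, supplied by the caller
}

function resolveWorkflowId(req: GovernRequest): string {
  // Prefer the caller's explicit, per-request ID; otherwise derive a
  // stable default from the action type rather than from static config.
  return req.workflowId ?? `wf:${req.actionType}`;
}

const explicit = resolveWorkflowId({ actionType: "deploy", workflowId: "wf-123" });
const derived = resolveWorkflowId({ actionType: "deploy" });
```

Under mixed workloads, two concurrent requests with different `workflowId` values stay distinguishable in the ledger, which a static config value cannot provide.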


The Contract Gaps

Then we ran the end-to-end smoke test: real implementations, real key pair, real snapshot store, real signing, real replay.

It failed twice before it passed.

Both failures were contract gaps — assumptions that held under mocks but broke with real integration:

Gap 1: agent_id semantics. The snapshot store expected agent_id to be the GEI signing authority's keyId, not the intent agent's name. Replay verification checks snapshot.agent_id === decision.key_id. That is the right contract — the question is who signed the governed action — but nothing in the type system had enforced it.

Gap 2: window_end semantics. The snapshot window is not just the time range of the underlying entries. It is the validity horizon for the decision. Replay verification checks that the token's issued_at falls within that snapshot window. Even minimal clock drift between capture and signing caused failure. The fix was to set window_end = now + tokenTtlMs at the orchestration layer, matching the pattern established in INT-002.
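Both contracts can be sketched as a single verification function. The field names follow the post (`agent_id`, `key_id`, `issued_at`, the snapshot window); the function itself and the window fields' exact shape are illustrative.

```typescript
// Sketch of the two replay-verification contracts that surfaced as gaps:
// agent_id must equal the signing authority's key_id, and the token's
// issued_at must fall inside the snapshot's validity window. Field names
// follow the post; the implementation is illustrative.

interface Snapshot { agent_id: string; window_start: number; window_end: number }
interface Decision { key_id: string; issued_at: number }

function verifyReplay(snapshot: Snapshot, decision: Decision): boolean {
  // Gap 1: the snapshot is attributed to the signing authority's key.
  if (snapshot.agent_id !== decision.key_id) return false;
  // Gap 2: the token was issued within the decision's validity horizon.
  return (
    decision.issued_at >= snapshot.window_start &&
    decision.issued_at <= snapshot.window_end
  );
}

// The fix for gap 2: extend window_end past capture time by the token
// TTL, so clock drift between capture and signing cannot break replay.
const now = Date.now();
const tokenTtlMs = 5_000;
const snap = { agent_id: "gei-key-1", window_start: now - 1_000, window_end: now + tokenTtlMs };
const ok = verifyReplay(snap, { key_id: "gei-key-1", issued_at: now + 2 });
```

With `window_end` set to capture time alone, the `issued_at` check above is exactly the one that fails under minimal clock drift.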

These are exactly the kinds of issues that only surface when you stop proving the idea and start running the system.

After those fixes, the smoke test passed:

01 · What did the agent know? PCK snapshot captured, hashed, and persisted before action.
02 · What was it authorised to do? GEI token signed, binding snapshot hash to policy and action.
03 · What did it actually do? Execution recorded in an append-only, hash-chained ledger.
04 · Can you prove it? Deterministic replay reconstructs the chain after the fact.

Every question answered with verifiable evidence. Not logs. Proof.
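The append-only, hash-chained ledger behind question 03 works on a simple principle: each entry embeds the hash of its predecessor, so altering any past entry breaks every later link. This is a hypothetical sketch of that principle, not the real ledger implementation.

```typescript
// Sketch of an append-only, hash-chained ledger: each entry commits to
// the previous entry's hash, so tampering with history is detectable.
// Hypothetical implementation, not the real ledger.
import { createHash } from "node:crypto";

interface LedgerEntry { index: number; action: string; prevHash: string; hash: string }

function appendEntry(ledger: LedgerEntry[], action: string): LedgerEntry {
  const prevHash = ledger.length ? ledger[ledger.length - 1].hash : "genesis";
  const hash = createHash("sha256")
    .update(`${ledger.length}|${action}|${prevHash}`)
    .digest("hex");
  const entry = { index: ledger.length, action, prevHash, hash };
  ledger.push(entry);
  return entry;
}

function verifyChain(ledger: LedgerEntry[]): boolean {
  return ledger.every((e, i) => {
    const prev = i === 0 ? "genesis" : ledger[i - 1].hash;
    const expected = createHash("sha256")
      .update(`${i}|${e.action}|${prev}`)
      .digest("hex");
    return e.prevHash === prev && e.hash === expected;
  });
}

const ledger: LedgerEntry[] = [];
appendEntry(ledger, "repo.write");
appendEntry(ledger, "test.run");
const intact = verifyChain(ledger); // true until any entry is mutated
```

Rewriting `ledger[0].action` after the fact makes `verifyChain` fail, which is what separates a hash-chained ledger from an ordinary log.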

What This Proves — and What It Doesn't

What it proves: the governed execution primitive works.

You can capture an agent's knowledge state, bind it cryptographically to an authorised action, execute that action with a signed token, and replay the decision chain after the fact. Every step fails closed. The chain is deterministic. The full test stack is green.

What it does not prove: that this already works in someone else's stack, at their scale, under their governance model, with their compliance obligations.

That's not a hedge. It's the honest next step.

A primitive that works in our stack is validated technology. A primitive that works in someone else's stack is a product.

We're looking for design partners with real agent workloads and real governance requirements: financial services firms under DORA pressure, enterprises deploying agents into regulated operations, and teams that know "just keep the logs" is not going to be enough.

The spine is live. The primitive is real.

Now we find out whether the world needs it.


Models will change. Agents will restart. Vendors will come and go. The question is whether your rules, your knowledge, and your audit trail survive the transition.

AmplefAI builds the independent governance layer that ensures AI capability remains accountable to your institution — not your provider.

Learn more at amplefai.com
