Architecture

Last updated April 26, 2026

Every governed AI evaluation produces a structured record — same format whether you run Workshop, Refinery, or Clean Room.

The Kenshiki Labs platform architecture is a three-plane runtime contract — build, orchestration, and control — that produces a structured per-decision record on every governed AI response. The same audit chassis runs across all deployment tiers; what changes between Workshop, Refinery, and Clean Room is not the contract but the depth: more telemetry, stronger enforcement, and in Clean Room a signed attestation chain anchored to hardware. This is the canonical end-to-end specification operators integrate against.

[Figure: pipeline stages and their phases. Sanitize (prepare) → Compile (prepare) → Generate (propose) → Evaluate (prove) → Output (prove). Labels: Kenshiki Labs control plane (signed envelope, chain of custody); Kura scopes context; Ledger validates claims; Your data (outside Kenshiki Labs).]

Prepare

Before the model does anything, the request is bound to identity and to an approved evidence boundary. The Prompt Sanitizer authenticates the caller and propagates the access scope. Kura scopes retrieval to the governance corpus, with access controlled at the claim level. The model never sees data the caller is not authorized to use.

Identity binding via OpenFGA/ReBAC at the entry point
Retrieval scoped to the governance corpus and the caller's access boundary
SIRE exclusion gate purges out-of-scope chunks before they reach the model
The model is constrained by what reaches it, not by what comes back
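
The exclusion step can be pictured with a minimal sketch. It assumes each retrieved chunk carries a document ID and that the caller's authorized document set has already been resolved upstream (in the platform that resolution is attributed to OpenFGA/ReBAC). The names `Chunk` and `exclude_out_of_scope`, and the sample data, are hypothetical illustrations, not Kenshiki Labs APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str  # source document the chunk was cut from
    text: str


def exclude_out_of_scope(chunks, authorized_doc_ids):
    """Split chunks into (kept, purged) by the caller's authorized document set.

    Hypothetical stand-in for the exclusion gate: anything the caller may not
    read is dropped before prompt assembly, so the model never sees it.
    """
    kept, purged = [], []
    for chunk in chunks:
        (kept if chunk.doc_id in authorized_doc_ids else purged).append(chunk)
    return kept, purged


# Illustrative data only.
retrieved = [
    Chunk("policy-7", "Retention period is 7 years."),
    Chunk("hr-12", "Salary bands for 2025."),
]
kept, purged = exclude_out_of_scope(retrieved, authorized_doc_ids={"policy-7"})
print([c.doc_id for c in kept])    # ['policy-7']
print([c.doc_id for c in purged])  # ['hr-12']
```

Returning the purged chunks alongside the kept ones is a design choice for this sketch: it lets the exclusion itself be logged as part of the audit record.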

Propose

The model is allowed to do useful reasoning and generation, but only inside the approved boundary. The Compiler rewrites the request into a CFPO-ordered prompt contract (Content–Format–Policy–Output) using five deterministic passes. Generation receives bounded context, not raw corpus access. Kenshiki Labs is not making the model smaller; it is constraining the execution environment around it.

CFPO ordering matches model attention behavior
Five-pass deterministic rewrite — same input always produces the same contract
Generation receives only authorized evidence
The model stays whole — execution is what changes
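
The two properties named above, fixed CFPO ordering and "same input, same contract", can be illustrated with a short sketch. The five passes themselves are not specified here, so this shows only ordering and determinism; `compile_contract`, the section keys, and the SHA-256 fingerprint are assumptions made for illustration.

```python
import hashlib
import json

# Content, Format, Policy, Output: the ordering named in the text.
CFPO_ORDER = ("content", "format", "policy", "output")


def compile_contract(sections):
    """Emit sections in fixed CFPO order and fingerprint the result.

    Illustrative only: it does not reproduce the Compiler's five passes,
    just the ordering and determinism properties.
    """
    missing = [name for name in CFPO_ORDER if name not in sections]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    ordered = [{"section": name, "text": sections[name].strip()} for name in CFPO_ORDER]
    # Canonical JSON so identical input always hashes to the same digest.
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return {"sections": ordered, "digest": hashlib.sha256(canonical.encode()).hexdigest()}


# The same sections supplied in a different order compile to the same contract.
a = compile_contract({"output": "JSON", "policy": "cite sources", "content": "Q3 filings", "format": "bullets"})
b = compile_contract({"content": "Q3 filings", "format": "bullets", "policy": "cite sources", "output": "JSON"})
print(a == b)
```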

Prove

Every decision is written to an immutable claim ledger with provenance. The Ledger decomposes the response into atomic claims and verifies each at L1–L4 depth (token confidence, source entailment, multi-draw stability, hidden-state probes). The Gate assigns one of five output states deterministically. ARBV produces signed Boundary Evidence Records that auditors can independently replay.

L1–L4 claim evaluation, depth varies by tier
Five output states — AUTHORIZED, PARTIAL, REQUIRES_SPEC, NARRATIVE_ONLY, BLOCKED
Signed Boundary Evidence Records, replayable by auditors and partners
The artifact becomes the system of record; the model stays whole
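
What "independently replay" could mean in practice is sketched below: a verifier recomputes a signature over a canonical serialization of the record and compares it with the stored one. The signature scheme is not specified in this document; HMAC-SHA256 with a shared demo key is used here purely as a stand-in, and the record fields are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(record):
    # Stable serialization so signer and verifier hash identical bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def sign_record(record, key):
    return hmac.new(key, canonical_bytes(record), hashlib.sha256).hexdigest()


def verify_record(record, signature, key):
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sign_record(record, key), signature)


key = b"demo-key-not-for-production"  # assumption: real keys would be managed elsewhere
record = {"claim": "Retention period is 7 years.", "state": "AUTHORIZED", "policy_version": "v1"}
signature = sign_record(record, key)

tampered = dict(record, state="BLOCKED")
print(verify_record(record, signature, key))    # True
print(verify_record(tampered, signature, key))  # False
```

An asymmetric scheme would let auditors verify without holding the signing key, which is closer to what third-party replay implies; the shared-key version is only the shortest runnable illustration.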

The Contract

Kura is the evidence store — you put source material in with provenance, structure, and retrieval boundaries. Kadai is the reasoning API — you query it and get back answers grounded in what Kura contains. The model renders — Kura decides what counts.

Kura: source material with provenance, structure, and retrieval boundaries
Kadai: answers grounded in what Kura contains
Same contract across Workshop, Refinery, and Clean Room

What Happens at Runtime

A question enters. The Compiler rewrites it into a constrained query using CFPO (Content–Format–Policy–Output). The Crosswalk retrieves only governed evidence relevant to the question and the caller's access boundary (OpenFGA/ReBAC). The generation layer produces a proposal from that bounded context. The Claim Ledger decomposes the proposal into claims, checks each against evidence using contrastive causal attribution alongside calibrated confidence and entailment signals, and records what's supported, unsupported, or missing. The Boundary Gate makes the final release decision.

Compiler: loose prompt → disciplined, governed query (CFPO)
Crosswalk: SIRE-scoped retrieval by evidence + caller identity (OpenFGA/ReBAC)
Generation: model produces a proposal from bounded context
Claim Ledger: claims decomposed, checked via contrastive attribution, recorded
Boundary Gate: deterministic emission decision over versioned evidence and policy
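
The Claim Ledger's bookkeeping of supported, unsupported, and missing claims can be sketched as follows. This is a simplification: it uses a single entailment score, whereas the text describes contrastive causal attribution alongside calibrated confidence and entailment. The threshold, class name, and sample claims are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SUPPORT_THRESHOLD = 0.8  # hypothetical cut-off, not a Kenshiki Labs value


@dataclass(frozen=True)
class ClaimCheck:
    claim: str
    evidence_id: Optional[str]  # None when no governed evidence was retrieved
    entailment: float = 0.0     # assumed 0..1 score from an entailment check


def classify(check):
    if check.evidence_id is None:
        return "missing"
    return "supported" if check.entailment >= SUPPORT_THRESHOLD else "unsupported"


def ledger_summary(checks):
    summary = {"supported": [], "unsupported": [], "missing": []}
    for check in checks:
        summary[classify(check)].append(check.claim)
    return summary


checks = [
    ClaimCheck("Records are kept for 7 years.", "policy-7", 0.93),
    ClaimCheck("Backups are encrypted.", "policy-9", 0.41),
    ClaimCheck("Audits run quarterly.", None),
]
summary = ledger_summary(checks)
print(summary)
```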

Output States

Every response carries an explicit state. "No evidence, no emission" means no unsupported decision-grade claim is emitted as authorized. The system can surface partial or narrative responses — but labels them so the caller knows what they're looking at.

AUTHORIZED: claims sufficiently supported by evidence
PARTIAL: evidence exists but coverage incomplete
REQUIRES_SPEC: question needs tighter scope
NARRATIVE_ONLY: descriptive but not decision-grade
BLOCKED: policy or evidence conditions not met
Qualifier — DEGRADED_BOUNDARY: any state may carry this when the Kura evidence boundary was incomplete
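
A deterministic gate over these states might look like the sketch below. The five state names and the DEGRADED_BOUNDARY qualifier come from the list above; the input flags and the order in which rules are checked are assumptions, since the actual gate policy is not described here.

```python
from enum import Enum


class OutputState(Enum):
    AUTHORIZED = "AUTHORIZED"
    PARTIAL = "PARTIAL"
    REQUIRES_SPEC = "REQUIRES_SPEC"
    NARRATIVE_ONLY = "NARRATIVE_ONLY"
    BLOCKED = "BLOCKED"


def gate(supported, total, *, policy_ok=True, in_scope=True,
         decision_grade=True, boundary_complete=True):
    """Return (state, qualifiers). Rule order is an assumption for illustration."""
    if not policy_ok:
        state = OutputState.BLOCKED
    elif not in_scope:
        state = OutputState.REQUIRES_SPEC
    elif not decision_grade:
        state = OutputState.NARRATIVE_ONLY
    elif total > 0 and supported == total:
        state = OutputState.AUTHORIZED
    elif supported > 0:
        state = OutputState.PARTIAL
    else:
        # No supported claims at all: "no evidence, no emission".
        state = OutputState.BLOCKED
    qualifiers = [] if boundary_complete else ["DEGRADED_BOUNDARY"]
    return state, qualifiers


print(gate(3, 3))
print(gate(2, 3, boundary_complete=False))
```

Because the function has no hidden inputs, the same claim counts and flags always yield the same state, which is the property an auditor needs in order to replay a decision.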

Platform Systems

The tiers define enforcement depth. These systems define how governed inference is built, measured, and improved.

Kura — evidence store. Aurora PostgreSQL with pgvector and tenant-scoped RLS.
Kadai — reasoning API. Returns responses grounded in Kura, with claims checked and states assigned.
Prompt Compiler — rewrites prompts using CFPO. Compiled, versioned, machine-parseable.
Crosswalk — retrieval + access control. Builds the authority map, enforces per-caller evidence scoping via OpenFGA/ReBAC.
Claim Ledger — L1–L4 evaluation. Decomposes responses into atomic claims, records confidence signals, source entailment, stability, and contrastive causal attribution.
Boundary Gate — emission. Deterministic gate decisions over versioned evidence and policy.
Neurosurgery — observability. In Workshop: returned telemetry and repeat-pass behavior. In Refinery/Clean Room: local model telemetry and hidden-state probes.

How Tiers Change the Assurance Boundary

One pipeline. Three deployment models. The difference is where the model runs and how much runtime evidence you have about what it did.

Workshop: shared Kadai or model API gateway. Full pipeline audit. L1–L3 evaluation (no hidden-state probes).
Refinery: private inference. Full audit plus local telemetry and chain of custody.
Clean Room: air-gapped, hardware-rooted. Full audit, local telemetry, signed attestation chain, and strong support for third-party review.
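
The tier differences can be summarized as data. The sketch below is one reading of this page: it assumes the L1 to L4 labels map, in order, onto token confidence, source entailment, multi-draw stability, and hidden-state probes, and that signed attestation applies only in Clean Room. The dictionary layout and key names are hypothetical.

```python
# Assumed mapping of layer labels to the checks named earlier on this page.
LAYERS = {
    "L1": "token confidence",
    "L2": "source entailment",
    "L3": "multi-draw stability",
    "L4": "hidden-state probes",
}

TIERS = {
    "workshop": {"layers": ("L1", "L2", "L3"), "local_telemetry": False, "signed_attestation": False},
    "refinery": {"layers": ("L1", "L2", "L3", "L4"), "local_telemetry": True, "signed_attestation": False},
    "clean_room": {"layers": ("L1", "L2", "L3", "L4"), "local_telemetry": True, "signed_attestation": True},
}


def evaluation_plan(tier):
    """List the evaluation layers that run for a given tier."""
    return [f"{layer}: {LAYERS[layer]}" for layer in TIERS[tier]["layers"]]


print(evaluation_plan("workshop"))
print(evaluation_plan("clean_room"))
```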

Kenshiki Labs Is and Is Not

Is a governance pipeline that gates claims against evidence
Is the governing layer across all three planes: build (Kura, Compiler), orchestration (Kadai), and control (Ledger, Gate)
Is not a model — it governs the generation layer, doesn't replace it
Is not a content filter — it checks evidence, not tone or topic
Is not a monitoring tool — it intervenes before emission, not just after
Is not a replacement for your data — it checks against it

Runtime Infrastructure

The same infrastructure discipline that applies to the synthesis pipeline applies to the systems running it.

Network: separate VPCs for web/auth and inference workloads
Identity: Clerk (Workshop) / customer IdP (Refinery, Clean Room) with JWT propagation
Access: OpenFGA/ReBAC — per-caller, per-document evidence scoping at retrieval
Data: Aurora PostgreSQL with tenant-scoped row-level security
Ingestion: GPU-accelerated parsing (Docling — DocLayNet, TableFormer, EasyOCR), two-stage pipeline, provenance chain from upload through embedding
Inference: dedicated GPU instances (NVIDIA L40S), model artifacts verified at boot, digest-pinned images, vLLM with fp8 KV cache
Isolation: embedding and inference on separate hardware
Deploy: CDK-managed, gated manifest with pre-flight checks and rollback, services scale to zero when idle
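
"Model artifacts verified at boot" usually comes down to comparing a file's hash against a pinned value before the server starts. A minimal sketch, assuming SHA-256 digests and a local file path; the function names and the temporary demo file are hypothetical, and the actual verification mechanism is not described here.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large weight files never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_artifact(path, pinned_digest):
    """Raise before serving if the artifact does not match its pinned digest."""
    actual = sha256_file(path)
    if actual != pinned_digest:
        raise RuntimeError(f"digest mismatch for {path}: got {actual}")
    return True


# Demo with a throwaway file standing in for model weights.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.bin"
    artifact.write_bytes(b"example weights")
    pinned = hashlib.sha256(b"example weights").hexdigest()
    ok = verify_artifact(artifact, pinned)
    try:
        verify_artifact(artifact, "0" * 64)
        rejected = False
    except RuntimeError:
        rejected = True
print(ok, rejected)
```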

Telemetry and Enforcement

Every pipeline stage emits structured telemetry. In Refinery and Clean Room, where the model runs locally, that telemetry includes inference request logs, logprob distributions, entailment scores, and ablation signals. OpenFGA/ReBAC enforces access control at retrieval, so the model only sees evidence the caller is authorized to use.

Logprobs, entailment scores, and coverage metrics per response
OpenFGA/ReBAC enforces per-caller evidence scoping
CFPO ensures deterministic, auditable prompt structure
Every prompt versioned, compiled (not authored), machine-parseable
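
As an illustration of how per-response metrics could be derived from raw signals, the sketch below turns token logprobs into a single confidence number and computes claim coverage. The geometric-mean formula and the coverage definition are common conventions assumed here, not formulas documented for the platform.

```python
import math


def token_confidence(logprobs):
    """Geometric-mean token probability: exp(mean(logprob)).

    One common way to collapse per-token logprobs into a single score;
    the platform's actual formula is not specified.
    """
    if not logprobs:
        raise ValueError("no tokens")
    return math.exp(sum(logprobs) / len(logprobs))


def coverage(supported_claims, total_claims):
    """Fraction of claims backed by evidence; 0.0 when there are no claims."""
    return supported_claims / total_claims if total_claims else 0.0


confident = token_confidence([math.log(0.9), math.log(0.9)])
hesitant = token_confidence([math.log(0.9), math.log(0.1)])
print(round(confident, 3), round(hesitant, 3), coverage(3, 4))
```

The geometric mean penalizes a single low-probability token more than an arithmetic mean would, which suits a signal meant to flag weak spots in a response.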

What Your Auditor Gets

Your auditor gets a structured record for every evaluation: what was asked, what was in scope, what claims were made, what held up, what didn't, and why the state was assigned. The format is the same across tiers; enforcement depth and attestation grow as you move up.

Per-claim audit trail with source attribution, layer scores, and gate reason codes
Complete request provenance including model, evidence source, embedding, and compiler versions
Structured telemetry for observability and audit surfaces
In Clean Room, every step signed and anchored to verified hardware
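
The shape of such a record might resemble the sketch below. The field names follow the items listed above (source attribution, layer scores, gate reason codes, and model, evidence source, embedding, and compiler versions), but the schema, class names, and sample values are hypothetical; the actual record format is not reproduced here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ClaimEntry:
    claim: str
    source_id: str      # source attribution
    layer_scores: dict  # e.g. {"L1": 0.91, "L2": 0.88}
    reason_code: str    # gate reason code (illustrative values)


@dataclass
class EvaluationRecord:
    question: str
    state: str
    versions: dict  # model, evidence_source, embedding, compiler
    claims: list = field(default_factory=list)

    def to_json(self):
        # sort_keys keeps the serialized form stable across runs.
        return json.dumps(asdict(self), sort_keys=True, indent=2)


record = EvaluationRecord(
    question="How long are records retained?",
    state="AUTHORIZED",
    versions={"model": "m-1", "evidence_source": "kura-1", "embedding": "e-1", "compiler": "c-1"},
    claims=[ClaimEntry("Records are kept for 7 years.", "policy-7", {"L1": 0.91, "L2": 0.88}, "SUPPORTED")],
)
parsed = json.loads(record.to_json())
print(parsed["state"], len(parsed["claims"]))
```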

Start Where You Are

Most teams progress in stages rather than jumping straight to the highest-assurance environment.

Workshop (hours): start on shared infrastructure with Kadai or your existing public model APIs. Retrieval, claim checking, output states — same contract either way.
Refinery (days to weeks): private deployment. Governed data sources, private inference engine. Full attribution at the model boundary.
Clean Room (weeks to months): signed everything. Attested execution. Air-gapped. For when a court or regulator asks to inspect every step.

Go deeper

Workshop: Start with shared Kadai or your existing model APIs. See what governed synthesis does before you move the stack.
Refinery: Private deployment with stronger controls and telemetry at the model boundary.
Clean Room: Air-gapped, hardware-rooted, and built for high-assurance review.
Claim Ledger: The verification engine inside every response across all tiers.
Integrations: How Kenshiki Labs plugs into AI factories, enterprise SSO, evidence systems, and GRC workflows.