AI Governance
What is runtime AI governance?
Runtime AI governance is the enforcement of access, policy, and disclosure rules at the moment an AI system generates a response — rather than after the fact through audit review or before the fact through policy documents. It requires the governance layer to sit inside the inference pipeline, not alongside it.
Why this matters
Most AI systems rely on one of two governance approaches: audit after release (you catch the mistake weeks later when a customer complains) or policy before generation (you hope the model respects the prompt). Neither intervenes at the moment of failure.
When a model is asked “What is the cardholder data in our database?”, the answer is determined by three factors: whether cardholder data is in the retrieval pool, what the model decides to claim, and whether that claim gets verified before emission. Most systems only control one of these. Runtime governance controls all three.
In regulated industries — healthcare, finance, defense, government — this matters. You cannot tell a compliance officer “we hope the model was accurate” or “we checked the logs afterward.” You need evidence that policy was enforced during generation, not after.
How it works
Runtime governance sits at four decision points in the inference pipeline:
- Evidence scope: Before retrieval, the caller’s authorization boundary is applied. Only documents they’re allowed to see go into the retrieval pool. This is enforced by REBAC (relationship-based access control) — the system knows the caller’s identity, their role, their clearances, and which evidence they can access.
- Retrieval boundaries: The retrieval engine respects policy zones defined by evidence metadata. Some evidence is classified as “policy-bearing” (regulations, mandates, must-enforce). Some is “format-bearing” (schemas, structures). Some is “advisory” (context, narrative). The model sees all three, but the Claim Ledger tracks which type backed each claim.
- Policy gates: Before emission, gates fire based on the request, the evidence, and the claimed output. A gate might ask: “Does this claim appear in the retrieved evidence?” or “Is this claim authorized at the caller’s clearance level?” or “Does this output state meet policy?” If gates fail, the output state changes (PARTIAL, REQUIRES_SPEC, BLOCKED).
- Audit trail: The Claim Ledger records every decision: what was in scope, what was retrieved, what gates fired, what the model claimed, and what the final output state was. That ledger is tamper-evident and exportable — an auditor or regulator can replay it.
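The four decision points above can be sketched as a minimal pipeline. Everything here (`Document`, `govern`, `OutputState`, the role sets) is an illustrative assumption, not the Kenshiki Labs API, and retrieval and grounding are reduced to keyword and substring checks for brevity:

```python
from dataclasses import dataclass
from enum import Enum

class OutputState(Enum):
    OK = "OK"
    PARTIAL = "PARTIAL"
    REQUIRES_SPEC = "REQUIRES_SPEC"
    BLOCKED = "BLOCKED"

@dataclass
class Document:
    doc_id: str
    text: str
    zone: str            # "policy-bearing" | "format-bearing" | "advisory"
    allowed_roles: set

@dataclass
class LedgerEntry:
    caller: str
    in_scope: list       # doc ids the caller was authorized to see
    retrieved: list      # doc ids actually retrieved for this query
    gates_fired: list    # gates that failed
    claim: str
    state: OutputState

def govern(caller_role: str, query: str, corpus: list, model) -> LedgerEntry:
    # 1. Evidence scope: apply the caller's authorization boundary before retrieval.
    in_scope = [d for d in corpus if caller_role in d.allowed_roles]
    # 2. Retrieval boundaries: naive keyword retrieval within the scoped pool only.
    retrieved = [d for d in in_scope
                 if any(w in d.text.lower() for w in query.lower().split())]
    claim = model(query, retrieved)
    # 3. Policy gates: verify the claim against retrieved evidence before emission.
    gates, state = [], OutputState.OK
    if not any(claim.lower() in d.text.lower() for d in retrieved):
        gates.append("grounding_gate")
        state = OutputState.BLOCKED
    # 4. Audit trail: record every decision for later replay.
    return LedgerEntry(caller_role, [d.doc_id for d in in_scope],
                       [d.doc_id for d in retrieved], gates, claim, state)
```

Note that an unauthorized caller never sees a retrieval failure as an error: the document simply is not in their pool, and the ledger entry shows an empty scope.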
None of this requires changing your model. The governance layer wraps the inference pipeline, so you can use any model (GPT, Claude, Llama, etc.) and any orchestration framework. The boundary is enforced in software, not in model weights.
How Kenshiki Labs implements this
Kenshiki Labs provides the three components that make runtime governance possible:
- Kura (evidence boundary): Stores governed source material with provenance, structure, and retrieval boundaries. Every chunk is tagged with SIRE identity (what it covers, what it relates to, what it must never answer). Evidence is tenant-scoped and REBAC-enforced — the caller only sees what they’re authorized to access.
- Prompt Compiler and Boundary Gate: Maps evidence metadata to policy zones and enforces gates at decision points. The compiler converts policy into prompt structure; the gate verifies the output against evidence.
- Claim Ledger: Records every retrieval decision, REBAC resolution, gate outcome, and output state. Exportable in JSON/CSV. Tamper-evident (HMAC-SHA-256 signed on each chunk).
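The tamper-evident property can be illustrated with a hash-chained HMAC-SHA-256 scheme. This is a sketch of the general technique, not the Kenshiki Labs wire format; the function names and entry shape are assumptions:

```python
import hashlib
import hmac
import json

def sign_chunk(key: bytes, prev_sig: str, entry: dict) -> str:
    # Each chunk's signature covers the previous signature, chaining entries
    # so that altering or removing any earlier entry invalidates all later ones.
    payload = prev_sig.encode() + json.dumps(entry, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def sign_ledger(key: bytes, entries: list) -> list:
    sig, signed = "", []
    for entry in entries:
        sig = sign_chunk(key, sig, entry)
        signed.append({"entry": entry, "sig": sig})
    return signed

def verify_ledger(key: bytes, signed: list) -> bool:
    # An auditor holding the key recomputes the chain; any mismatch
    # means the ledger was modified after signing.
    sig = ""
    for item in signed:
        expected = sign_chunk(key, sig, item["entry"])
        if not hmac.compare_digest(expected, item["sig"]):
            return False
        sig = item["sig"]
    return True
```

Using `hmac.compare_digest` for the comparison avoids timing side channels during verification.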
Kenshiki Labs runs on three deployment tiers, all with the same governance semantics:
- Workshop: Shared infrastructure. Deployment in hours.
- Refinery: Private VPC. Your infrastructure, Kenshiki Labs in your boundary.
- Clean Room: Air-gapped. Zero network path. Full governance chain verifiable offline.
Frequently asked questions
Is runtime AI governance the same as a GRC platform?
No. GRC platforms manage governance policies and compliance documentation after the fact. Runtime AI governance enforces policy at the moment of inference — before the model generates a response. A GRC tool tells you what you should have done; runtime governance prevents you from doing the wrong thing. They complement each other, but they operate at different points in the lifecycle.
How does runtime governance interact with an EU AI Act compliance program?
The EU AI Act requires demonstrating that high-risk AI systems operate within documented boundaries and leave an auditable trail. Runtime governance provides that boundary enforcement (via policy gates and evidence scoping) and audit trail (via the Claim Ledger). It turns compliance requirements into architecture, not just documentation.
Can I add runtime governance on top of my existing RAG pipeline?
Yes. Runtime governance doesn't replace RAG — it sits inside it. The evidence retrieval layer, the prompt composition, the model inference, and the output admission decision all become policy-enforced. You keep your existing retrieval, but governance-wrap each stage so evidence scope and policy decisions are explicit and auditable.
What does runtime governance do that a prompt-injection guardrail doesn't?
A guardrail detects and blocks malicious prompts. Runtime governance prevents legitimate requests from accessing evidence they're not authorized for, ensures the model only retrieves from approved sources, and verifies that each claim in the response is grounded in that evidence. Guardrails are defensive; runtime governance is structural.
How do I prove to an auditor that runtime governance was actually enforced?
Every inference produces a Claim Ledger entry: what evidence was in scope, what was retrieved, which gates fired, what output state was assigned. That ledger is tamper-evident (HMAC-SHA-256 signed), exportable (JSON/CSV), and traces each claim back to its source. An auditor can replay any decision and verify the boundary was enforced.
Does runtime governance work with open-weight models or only frontier models?
Runtime governance is model-agnostic. The boundary is enforced around the model (evidence scoping, gate decisions), not inside it. You can use any model — from Llama to GPT — because the governance layer sits in the orchestration pipeline, not in the model weights.
What happens if the model hallucinates a claim that contradicts the evidence?
The Claim Ledger detects this and the output state is set to REQUIRES_SPEC or BLOCKED. The Ledger records what the evidence actually contained, what the model claimed, and the mismatch. An operator can then review and decide whether to emit a partial response, ask for clarification, or block the output entirely.
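One way to picture that mismatch handling is a per-claim grounding check that assigns an output state (illustrative only: real grounding uses entailment rather than substring matching, and this sketch collapses REQUIRES_SPEC into the blocked path for brevity):

```python
from enum import Enum

class OutputState(Enum):
    OK = "OK"
    PARTIAL = "PARTIAL"
    BLOCKED = "BLOCKED"

def check_claims(claims: list, evidence: list):
    # A claim is "grounded" here if it appears verbatim in some evidence passage.
    results = [(c, any(c.lower() in e.lower() for e in evidence)) for c in claims]
    grounded = sum(1 for _, ok in results if ok)
    if grounded == len(claims):
        state = OutputState.OK
    elif grounded > 0:
        state = OutputState.PARTIAL   # some claims verified; operator may emit a partial response
    else:
        state = OutputState.BLOCKED   # nothing verified; hold the output for review
    # The per-claim results are what the ledger records: claim, match, mismatch.
    return state, results
```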
Can runtime governance prevent data leakage from evidence to unauthorized users?
Yes. Evidence is scoped by the caller's authorization boundary (REBAC). If a user isn't authorized to see a document, that document isn't in the retrieval pool for their query. The model can't retrieve what it's not allowed to access, so the evidence boundary becomes a data boundary.
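The REBAC scoping described above can be sketched as a relationship-graph check. The tuple names and single-hop traversal are assumptions for illustration; production REBAC engines support much deeper relationship chains:

```python
# Relationship tuples: (subject, relation, object)
RELATIONS = {
    ("alice", "member_of", "finance-team"),
    ("finance-team", "reader_of", "doc:q3-forecast"),
    ("bob", "member_of", "eng-team"),
}

def authorized(subject: str, doc: str, relations=RELATIONS) -> bool:
    # Direct access: the subject is a reader of the document.
    if (subject, "reader_of", doc) in relations:
        return True
    # Indirect access: the subject belongs to a group that can read it.
    groups = {o for (s, r, o) in relations if s == subject and r == "member_of"}
    return any((g, "reader_of", doc) in relations for g in groups)

def scope_retrieval_pool(subject: str, corpus: list) -> list:
    # Unauthorized documents never enter the pool, so the model
    # cannot retrieve (or leak) what the caller cannot access.
    return [doc for doc in corpus if authorized(subject, doc)]
```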
Related concepts
- Governed agency — How to apply runtime governance to autonomous agents
- What is a governance control plane? — The architecture that makes runtime governance possible
- Claim Ledger — The audit trail that proves runtime governance was enforced
- REBAC (relationship-based access control) — How evidence scope is enforced per caller