AI Auditability
What is chain-of-custody for AI outputs?
Chain-of-custody for AI outputs is a complete, tamper-evident record of where evidence came from, what the model was allowed to see, what it claimed, whether those claims were grounded in evidence, and why the output was approved for emission. It creates verifiable provenance for every answer.
Why this matters
When an AI system gives a wrong answer, someone always asks: “What was it trained on?” or “What data did it have access to?” In regulated industries, those aren’t casual questions — they’re legal requirements. You need to prove what the system could have known.
Chain-of-custody for AI outputs fills that gap. It’s not just logging what the model said; it’s logging what the model was allowed to see, what it actually retrieved, and whether each claim in its response was grounded in that evidence.
How it works
Every AI inference produces a chain-of-custody record that includes:
- Source evidence: What documents, databases, and knowledge sources were in the retrieval pool for this request?
- Evidence scope: Which of those sources did the model actually retrieve from?
- Claim mapping: For each claim in the response, what evidence supports it?
- Verification: Did gates verify that each claim matched its source evidence?
- Cryptographic proof: Is the record signed and tamper-evident?
This allows an auditor to ask: “Did the model have access to cardholder data?” and get a provable answer. “Was this decision made with current or stale evidence?” Provable. “Could the model have retrieved a different answer if asked the same question today?” Verifiable.
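The record described above can be sketched as a plain data structure. The field names below are illustrative, not the actual Claim Ledger schema:

```python
from dataclasses import dataclass, field

@dataclass
class CustodyRecord:
    """Hypothetical chain-of-custody record for one inference (names are illustrative)."""
    request_id: str
    retrieval_pool: list[str]              # source evidence: every source in scope for this request
    retrieved: list[str]                   # evidence scope: sources the model actually pulled from
    claim_evidence: dict[str, list[str]]   # claim mapping: claim text -> supporting evidence IDs
    gate_outcomes: dict[str, bool]         # verification: gate name -> pass/fail
    output_state: str = ""                 # e.g. BLOCKED or PARTIAL, assigned after gating
    signature: str = ""                    # cryptographic proof, filled in at signing time

record = CustodyRecord(
    request_id="req-042",
    retrieval_pool=["kb-pricing", "kb-policy", "kb-pci"],
    retrieved=["kb-pricing"],
    claim_evidence={"Plan A costs $10/mo": ["kb-pricing"]},
    gate_outcomes={"grounding": True},
)
```

With a record like this, the auditor's questions become set-membership checks: "Did the model have access to cardholder data?" is `"kb-pci" in record.retrieval_pool`, and "Did it use it?" is `"kb-pci" in record.retrieved`.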
How Kenshiki Labs implements this
Kenshiki Labs’ Claim Ledger is the chain-of-custody engine. Every inference decision — retrieval boundary, gate outcome, output state — is recorded in a tamper-evident ledger. The ledger is:
- Cryptographically signed (HMAC-SHA-256 on each chunk).
- Exportable (JSON, CSV, audit-friendly formats).
- Complete (traces each claim back to its source evidence).
- Replayable (an auditor can step through the decision tree).
The ledger becomes your evidence file when regulators ask for proof.
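The tamper-evidence property comes from the per-chunk HMAC-SHA-256 signatures. A minimal sketch of that mechanism, using Python's standard library (the key handling and chunk layout here are assumptions, not Kenshiki Labs' implementation):

```python
import hashlib
import hmac
import json

# Illustrative key only; a real deployment would use a managed, rotated secret.
SECRET_KEY = b"ledger-signing-key"

def sign_chunk(chunk: dict) -> str:
    """Compute an HMAC-SHA-256 tag over a canonical JSON encoding of the chunk."""
    payload = json.dumps(chunk, sort_keys=True).encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

def verify_chunk(chunk: dict, tag: str) -> bool:
    """Recompute the tag; any edit to the chunk changes the HMAC, exposing tampering."""
    return hmac.compare_digest(sign_chunk(chunk), tag)

chunk = {"request_id": "req-042", "claims": ["Plan A costs $10/mo"], "state": "PARTIAL"}
tag = sign_chunk(chunk)
```

Verifying the original chunk against `tag` succeeds; altering any field and re-verifying fails, which is exactly the property an auditor relies on when replaying the ledger.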
Frequently asked questions
How is chain-of-custody for AI different from model audit logs?
Model audit logs capture the input prompt and the output response. Chain-of-custody for AI captures what evidence was available, what the model retrieved, what it was allowed to claim, and why the output was approved for emission. A log tells you what happened; chain-of-custody proves what the model was authorized to do.
Can I use chain-of-custody to defend against a liability claim?
Yes. If a customer claims your AI system gave them wrong information, you can produce the Claim Ledger and show: (1) what evidence was in scope, (2) what the model retrieved, (3) what it claimed, (4) whether that claim was in the evidence, and (5) what output state was assigned. If the output was BLOCKED, you have proof you tried to prevent it. If PARTIAL, you have proof you warned the caller.
Do regulators require chain-of-custody for AI?
The EU AI Act requires high-risk AI systems to maintain detailed records of how decisions were made. The NIST AI Risk Management Framework recommends that AI system behavior be traceable and subject to human oversight. Chain-of-custody is the mechanism for both. You're not required to call it that, but you are required to demonstrate it.
Is chain-of-custody the same as explainability?
No. Explainability is about understanding why the model made a decision (interpretability). Chain-of-custody is about proving what the model was allowed to do and what it actually did. You can have a fully explainable model with no chain-of-custody, or chain-of-custody with a black-box model. Both matter, separately.
Can I generate chain-of-custody retroactively from my existing logs?
Partially. If your existing logs record what evidence was retrieved and what the model claimed, you can reconstruct the chain. But if your logs only capture the prompt and response, you cannot retroactively know what was in scope or why gates fired. This is why governance needs to be built in, not bolted on.
What happens if the model hallucinates something in the response?
The Claim Ledger shows what evidence was retrieved and what the model claimed. If the claim doesn't appear in the evidence, the output state is set to REQUIRES_SPEC or BLOCKED — depending on policy. The ledger creates the proof that the system caught (or failed to catch) the hallucination.
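The gate decision can be illustrated with a toy grounding check. The verbatim-substring test and the `PASSED` state below are stand-ins for illustration; real gates use more sophisticated claim-to-evidence matching, and this shows only how the ledger decision is derived, not the matching technique:

```python
def assign_output_state(claims: list[str], evidence: list[str], policy: str = "strict") -> str:
    """Toy gate: a claim counts as grounded if it appears verbatim in some
    evidence passage. Ungrounded claims are blocked or flagged per policy."""
    ungrounded = [c for c in claims if not any(c in passage for passage in evidence)]
    if not ungrounded:
        return "PASSED"  # hypothetical state name for fully grounded output
    return "BLOCKED" if policy == "strict" else "REQUIRES_SPEC"
```

Whatever state is assigned, the record of the decision is what matters: the ledger preserves the claims, the evidence, and the state together, so it documents both the catches and the misses.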
How long do I need to keep chain-of-custody records?
That depends on your regulation and use case. For healthcare, typically 3-10 years. For financial services, often 6-7 years. For government contracts, sometimes longer. The advantage of chain-of-custody is that the records are small (just the evidence IDs, retrieval decisions, gate outcomes, and output state), so archiving them is inexpensive.
Can chain-of-custody prove that my AI system is safe?
It proves that your system operated within its documented boundaries and left evidence of that operation. That's not the same as proof of safety, but it's the foundation. A safe system is one where boundaries are defined, enforced, and auditable. Chain-of-custody provides the third: auditability.
Related concepts
- What is runtime AI governance? — The architecture that enables chain-of-custody
- Claim Ledger — The data structure that stores the chain-of-custody record
- Output states — How Kenshiki Labs signals confidence in an answer