AI & Machine Learning

Verifiable AI Infrastructure

AI inference at scale requires caching outputs. But cached AI outputs have zero provenance. Which model? Which version? Which training data? Cryptographic computation fingerprints change that.

31ns: Cache read latency
3 PQ: Signature families
EU AI Act: Articles 12-15
24KB: Self-verifying bundle
What Breaks

Cached AI Outputs Have No Provenance

AI inference at scale requires caching outputs. A sepsis risk model that runs in 200ms cannot run on every API call when the same patient vitals produce the same result. A credit scoring model that costs $0.03 per inference cannot recompute for every microservice that needs the score. A content moderation model that processes 10M posts per hour cannot evaluate each one from scratch. So you cache the results.

But cached AI outputs have zero provenance. Which model produced this result? Which version? Which training data? Which prompt? Which temperature? When the EU AI Act requires traceability for high-risk AI decisions, cached inference results are a compliance gap.

Regulatory Exposure

When a patient sues over an AI clinical recommendation, the cached result has no proof of origin. When a credit decision is challenged for bias, the cached score can't prove what rules produced it. When the EU AI Act mandates Article 12 logging for high-risk systems, cached inference is a blind spot. The cache layer has become the weakest link in AI governance.

Model Drift Blindness

Models update. Training data changes. Parameters shift. But cached results from the previous version continue to serve until TTL expires. A credit model retrained to remove a biased feature still serves cached scores computed with that feature. A medical model updated with new clinical trial data still returns old predictions from cache. Stale AI outputs are invisible — indistinguishable from fresh inference.

What Leaks

Current AI Caching Has Zero Integrity

Current AI caching: store the response as a Redis key-value pair. No provenance. No integrity. No audit trail. If the model updates, stale cached results continue to serve. If the training data changes, old results persist until TTL expires. The cache has no concept of model lineage — it's just bytes in, bytes out.
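To make the failure concrete, here is a minimal sketch of the naive pattern. It assumes a plain dict in place of Redis, and the names `naive_cache`, `score_v1`, `score_v2`, and `cached_score` are hypothetical, chosen only for illustration.

```python
# Hypothetical stand-ins: a dict plays the role of Redis, and two functions
# play the role of two versions of a credit scoring model.
naive_cache = {}

def score_v1(features):
    # "Old" model: still uses a feature that is later removed.
    return 600 + 20 * features["zip_risk"]

def score_v2(features):
    # "Retrained" model: the removed feature no longer affects the score.
    return 600

def cached_score(model, applicant_id, features):
    # The key is derived from the request only; nothing about the model,
    # its version, or its training data is recorded.
    key = f"score:{applicant_id}"
    if key not in naive_cache:
        naive_cache[key] = model(features)
    return naive_cache[key]

features = {"zip_risk": 3}
before_retrain = cached_score(score_v1, "applicant-42", features)
# The model is retrained, but the lookup still hits the old entry.
after_retrain = cached_score(score_v2, "applicant-42", features)
fresh = score_v2(features)
print(before_retrain, after_retrain, fresh)  # prints: 660 660 600
```

The second call returns the score computed by the old model, and nothing in the stored value says so.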

Semantic caches are worse. Embeddings of personal data can't be "deleted" without rebuilding the index. A user requests data deletion under GDPR — but their query embedding is entangled in the similarity index. You can delete the key, but the semantic ghost remains.

The Prompt Injection Cache Attack

An attacker crafts a prompt that produces a malicious response. The response gets cached. Every subsequent user with a similar query gets the poisoned response. In Redis: no way to detect the poisoning. No way to trace the origin. No way to prove what happened. No way to identify which users received the poisoned result. The cache amplifies the attack from one user to every user.

What Changes

Every Inference Becomes Provable

Cachee changes the AI caching model. Instead of storing raw inference results with no lineage, Cachee stores signed, fingerprinted computation results that cryptographically bind every output to the exact model, parameters, and data that produced it.

Computation Fingerprint

SHA3-256(model_version || prompt_hash || temperature || top_p || system_prompt_hash || training_data_hash): every inference result is bound to its exact inputs. Change ANY parameter and the fingerprint changes, so a lookup under the new fingerprint cannot match a result computed under the old one. Old results are not served. For exact-match lookups, stale inference is ruled out by construction.
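The fingerprint formula can be sketched with Python's standard `hashlib`. The page only writes `||` and does not specify a byte encoding, so the length-prefixed framing, the float formatting, and the dict used as a store are assumptions made for this example, not Cachee's actual wire format.

```python
import hashlib

def _frame(value: str) -> bytes:
    # Assumption: length-prefix each field so that ("ab", "c") and ("a", "bc")
    # cannot produce the same byte stream. The source only writes "||".
    data = value.encode("utf-8")
    return len(data).to_bytes(4, "big") + data

def computation_fingerprint(model_version: str, prompt_hash: str,
                            temperature: float, top_p: float,
                            system_prompt_hash: str,
                            training_data_hash: str) -> str:
    fields = [
        model_version,
        prompt_hash,
        repr(float(temperature)),  # one text form per value, so 0.7 and 0.70 agree
        repr(float(top_p)),
        system_prompt_hash,
        training_data_hash,
    ]
    digest = hashlib.sha3_256()
    for field in fields:
        digest.update(_frame(field))
    return digest.hexdigest()

# Hypothetical fingerprint-keyed store: a dict keyed by the hex digest.
store = {}
old_fp = computation_fingerprint("v3.1.2", "p" * 64, 0.0, 1.0, "s" * 64, "2026Q1")
store[old_fp] = "RISK_ELEVATED"

# After a model update the fingerprint differs, so the lookup misses.
new_fp = computation_fingerprint("v3.2.0", "p" * 64, 0.0, 1.0, "s" * 64, "2026Q1")
print(store.get(old_fp), store.get(new_fp))  # prints: RISK_ELEVATED None
```

Keying the store by the fingerprint is what turns a model update into a cache miss instead of a stale hit.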

Three PQ Signatures

Every cached inference is signed by three independent post-quantum signature families (ML-DSA-65, FALCON-512, SLH-DSA). Modification is detectable. Authenticity is provable. Cache poisoning is identifiable. No trust assumption required.
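The acceptance rule implied here (an entry is valid only if every signature family verifies) can be sketched as follows. Python's standard library has no post-quantum signatures, so this example assumes a keyed-hash placeholder in place of ML-DSA-65, FALCON-512, and SLH-DSA. The placeholder is symmetric and is not a real signature scheme; only the all-families-must-pass policy is being illustrated, and every function name is hypothetical.

```python
import hashlib
import hmac
from typing import Callable, Dict, Tuple

Signer = Callable[[bytes], bytes]
Verifier = Callable[[bytes, bytes], bool]

def placeholder_scheme(key: bytes) -> Tuple[Signer, Verifier]:
    # PLACEHOLDER ONLY: HMAC-SHA256 stands in for a real signature scheme so
    # the sketch runs with the standard library. It is not post-quantum and
    # not a public-key signature.
    def sign(message: bytes) -> bytes:
        return hmac.new(key, message, hashlib.sha256).digest()

    def verify(message: bytes, signature: bytes) -> bool:
        return hmac.compare_digest(sign(message), signature)

    return sign, verify

FAMILIES = ("ML-DSA-65", "FALCON-512", "SLH-DSA")
schemes = {name: placeholder_scheme(name.encode("utf-8")) for name in FAMILIES}

def sign_all(message: bytes) -> Dict[str, bytes]:
    return {name: sign(message) for name, (sign, _) in schemes.items()}

def verify_all(message: bytes, signatures: Dict[str, bytes]) -> Dict[str, bool]:
    # A family whose signature is missing counts as a failure.
    results = {}
    for name, (_, verify) in schemes.items():
        signature = signatures.get(name)
        results[name] = signature is not None and verify(message, signature)
    return results

entry = b"fingerprint=4e7b2a91f03c...;result=RISK_ELEVATED"
signatures = sign_all(entry)
print(all(verify_all(entry, signatures).values()))          # prints: True
print(all(verify_all(entry + b"!", signatures).values()))   # prints: False
```

Requiring all families to pass means an attacker would have to forge against every scheme at once, which is the point of signing with independent families.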

Hash-Chained Audit Log

Every state change is recorded: when each result was produced, verified, and served. Who read it. When the model updated. When the result was superseded. Tamper-evident. Delete an entry? Detectable. Modify an entry? Detectable. Reorder entries? Detectable.
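A hash chain of this kind can be sketched in a few lines. The entry layout, the JSON encoding, and the function names below are assumptions for illustration and say nothing about Cachee's on-disk format. Note that removing entries from the end of the log is only detectable when the expected head hash is kept somewhere outside the log.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry's "prev"

def entry_hash(prev_hash: str, event: str) -> str:
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha3_256(payload.encode("utf-8")).hexdigest()

def append(log: list, event: str) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})

def verify_chain(log: list, expected_head: str = None) -> bool:
    prev = GENESIS
    for entry in log:
        # Each entry must point at its predecessor and hash to its own value.
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    # Truncation at the tail is caught only by comparing with a stored head.
    return expected_head is None or prev == expected_head

audit_log = []
for event in ("Created (sepsis_model_v3.1.2)", "Verified (3/3 PASS)", "Read (attending physician)"):
    append(audit_log, event)
head = audit_log[-1]["hash"]

print(verify_chain(audit_log, head))                        # prints: True
print(verify_chain([audit_log[0], audit_log[2]], head))     # prints: False (deleted entry)
```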

Lifecycle State Machine

Active → Superseded (model updated) → Revoked (bias detected). Three caching patterns: exact match, semantic similarity, result-only. Verification modes: AlwaysVerify for medical/financial AI, Probabilistic for general inference. The cache knows when its contents are stale — and acts on it.
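The lifecycle and the two verification modes can be sketched as a small state machine. The transition table is an assumption: the page names the states and the order Active, Superseded, Revoked, but does not say whether an Active entry can be revoked directly, so that edge is a guess. The sampling rate and function names are likewise hypothetical.

```python
import random
from enum import Enum

class State(Enum):
    ACTIVE = "Active"
    SUPERSEDED = "Superseded"
    REVOKED = "Revoked"

class VerifyMode(Enum):
    ALWAYS_VERIFY = "AlwaysVerify"
    PROBABILISTIC = "Probabilistic"

# Assumed transition table; Active -> Revoked is a guess, not from the source.
ALLOWED = {
    State.ACTIVE: {State.SUPERSEDED, State.REVOKED},
    State.SUPERSEDED: {State.REVOKED},
    State.REVOKED: set(),
}

def transition(current: State, target: State) -> State:
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target

def servable(state: State) -> bool:
    # Only Active entries are returned to callers.
    return state is State.ACTIVE

def should_verify(mode: VerifyMode, sample_rate: float = 0.1, rng=random.random) -> bool:
    # AlwaysVerify checks signatures on every read; Probabilistic samples reads.
    if mode is VerifyMode.ALWAYS_VERIFY:
        return True
    return rng() < sample_rate

state = State.ACTIVE
state = transition(state, State.SUPERSEDED)  # model updated
print(state.value, servable(state))          # prints: Superseded False
```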

AI provenance is no longer a metadata tag. It's a mathematical property of the storage layer.

Verify This

AI Inference Verification — Live

cachee-ai-demo
[1/5] Caching AI clinical inference: sepsis risk prediction
  Patient: vitals_hash=8f3a91c2... | Model: sepsis_model_v3.1.2
  Result: RISK_ELEVATED (score: 0.87, threshold: 0.85)
[2/5] Creating computation fingerprint...
  Fingerprint: SHA3(patient_vitals_hash || sepsis_model_v3.1.2 || threshold_0.85 || training_data_2026Q1)
  Hash : 4e7b2a91f03c...
[3/5] Signing with 3 post-quantum families...
  ML-DSA-65 : 3,309 byte signature
  FALCON-512 : 656 byte signature
  SLH-DSA : 17,088 byte signature
[4/5] Verifying (no Cachee. no H33. no network.)...
  ML-DSA-65 : PASS
  FALCON-512 : PASS
  SLH-DSA : PASS
  RESULT: VALID
  Signed. Fingerprinted. Independently verifiable.
  This is not a cached response. This is proven inference.
[5/5] Audit trail: AUDITLOG sepsis-risk-vitals_8f3a91c2
  → Created 2026-05-02T14:30:00Z (sepsis_model_v3.1.2)
  → Verified 2026-05-02T14:30:01Z (3/3 signatures PASS)
  → Read 2026-05-02T14:32:18Z (triage nurse, ED Bay 4)
  → Read 2026-05-02T14:45:07Z (attending physician)
  → Superseded 2026-05-02T18:00:00Z (model updated to v3.2.0)
  Chain: INTACT (5 entries, head=b2c8d47e...)

Run it yourself: brew install cachee && cachee-gold-demo

What Becomes Possible

AI After Verifiable Infrastructure

Deterministic AI Replay

Reconstruct exactly what the model produced at any point in time. When a decision is contested ("what did the AI recommend on March 14th?"), the answer is one command: AUDITLOG. The computation fingerprint proves which model, which version, which parameters. Tamper-evident. Independently verifiable.
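Answering a "what was served at time T?" question amounts to a lookup over the recorded lifecycle, not a re-run of the model. The sketch below assumes a hypothetical in-memory history built from Created and Superseded timestamps. The `Version` type and `result_at` function are illustrative names, and the second entry's result value is invented for the example.

```python
from datetime import datetime, timezone
from typing import List, NamedTuple, Optional

class Version(NamedTuple):
    created: datetime
    superseded: Optional[datetime]  # None while the entry is still Active
    model_version: str
    result: str

def result_at(history: List[Version], when: datetime) -> Optional[Version]:
    # Return the entry that was Active at `when`, if any.
    for entry in history:
        if entry.created <= when and (entry.superseded is None or when < entry.superseded):
            return entry
    return None

def utc(*args) -> datetime:
    return datetime(*args, tzinfo=timezone.utc)

history = [
    Version(utc(2026, 5, 2, 14, 30), utc(2026, 5, 2, 18, 0), "sepsis_model_v3.1.2", "RISK_ELEVATED"),
    # Hypothetical follow-up entry; its result value is illustrative only.
    Version(utc(2026, 5, 2, 18, 0), None, "sepsis_model_v3.2.0", "RISK_NORMAL"),
]

hit = result_at(history, utc(2026, 5, 2, 15, 0))
print(hit.model_version, hit.result)  # prints: sepsis_model_v3.1.2 RISK_ELEVATED
```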

Agent Memory Attestation

AI agents with verifiable, tamper-evident memory. Every observation, every decision, every action cached with a computation fingerprint and three PQ signatures. Agent behavior becomes auditable. Memory becomes provable. Trust becomes mathematical.

EU AI Act Compliance

Articles 12-15 require logging, transparency, human oversight, and accuracy for high-risk AI systems. Cachee satisfies these at the cache layer: hash-chained audit trails (Art. 12), computation fingerprints binding results to model versions (Art. 13), lifecycle state machine with manual override (Art. 14), fingerprint invalidation on model change (Art. 15).

Governance Lineage

Full chain from training data → model → inference → cache → verification. Bias audit capability: prove which training data version produced contested results. Encrypted inference: cache results without exposing the input data. The cache becomes the governance layer, not just the performance layer.

AI Compliance Mapping

Regulation | Requirement | Cachee Implementation
EU AI Act, Article 12 | Logging & traceability | Hash-chained audit log, Merkle anchoring, AUDITVERIFY
EU AI Act, Article 13 | Transparency | Computation fingerprints binding results to model version + parameters
EU AI Act, Article 14 | Human oversight | Lifecycle state machine with manual Revoke + Override states
EU AI Act, Article 15 | Accuracy & robustness | Fingerprint invalidation on model/data change, cache poisoning detection
NIST AI RMF | Govern, Map, Measure, Manage | Governance lineage from training data to verification, provenance scoring
ISO 42001 | AI management system | Tamper-evident audit trail, lifecycle controls, deterministic replay
FDA AI/ML Guidance | Predetermined change control | Superseded state on model update, version-bound fingerprints
HIPAA (AI + ePHI) | 45 CFR 164.312 | Encrypted inference caching, access controls, audit logging, integrity

Frequently Asked Questions

How does Cachee create an AI inference audit trail?
Every cached inference result is recorded in a hash-chained, tamper-evident audit log. Each entry includes the computation fingerprint, three post-quantum signatures, and timestamps for creation, verification, and every subsequent read. The AUDITLOG command reconstructs the full lifecycle. AUDITVERIFY validates the entire chain integrity in one command.
How does Cachee make AI output verifiable?
Every cached inference carries a computation fingerprint: SHA3-256(model_version || prompt_hash || temperature || top_p || system_prompt_hash || training_data_hash). Change any parameter and the fingerprint changes — stale results are never served. Three independent PQ signature families sign every entry. Verification requires no network, no API, no trust assumption.
Does Cachee help with EU AI Act cache compliance?
Yes. The EU AI Act Articles 12-15 require traceability, transparency, human oversight, and accuracy for high-risk AI. Cachee satisfies these at the cache layer: hash-chained audit trails (Art. 12), computation fingerprints binding results to model versions (Art. 13), lifecycle state machine with manual override (Art. 14), and fingerprint invalidation on model change (Art. 15).
How does Cachee prove AI model provenance?
The computation fingerprint cryptographically binds every cached inference to the exact model version, training data version, prompt, and parameters that produced it. When a model updates, cached results from the previous version transition to Superseded state. The full governance lineage from training data to model to inference to cache to verification is reconstructable at any point.


Verifiable AI Infrastructure

Deploy Cachee in your VPC. Inference provenance built into the cache layer.
Every result signed. Every access audited. Every model version tracked.
