Why not just hash the output to create a cache key?

Hashing the output creates collisions between semantically different results. Consider: a STARK verification and a simulation both produce 'verified: true.' If you hash the output, they share a cache entry -- but they represent fundamentally different levels of assurance. A computation fingerprint hashes the provenance, not the result. The fingerprint answers 'how was this computed?' not 'what was computed?' This distinction is critical for audit trails, regulatory compliance, and any system where the origin of a result matters as much as the result itself.

What fields are included in a computation fingerprint?

A computation fingerprint includes five fields: (1) input_hash -- SHA3-256 of the input data, ensuring different inputs never share a cache entry; (2) computation_hash -- SHA3-256 of the function, circuit, or program that was executed; (3) parameter_hash -- SHA3-256 of the configuration parameters (FHE params, STARK config, key sizes, etc.); (4) version -- the engine name, semantic version, and circuit ID; (5) hardware_class -- whether the computation is Deterministic, NearDeterministic, or NonDeterministic. All five are concatenated and hashed with SHA3-256 to produce the final fingerprint.

Is computation fingerprinting the same as memoization?

No. Memoization caches f(x) = y -- the function and its arguments determine the cache key. Computation fingerprinting caches 'f(x) = y, computed by engine v1.2.3 with parameters P on deterministic hardware, signed by three post-quantum families.' The cached result is a provable fact with a cryptographic chain of custody, not a convenience optimization. Memoization has no concept of versioning, parameter binding, hardware class, or cryptographic attestation. If you upgrade the engine, memoization serves stale results. Computation fingerprinting creates a new cache entry because the version field changed.

SHA3-256 | Content-Addressed | Not Memoization

Computation Fingerprinting

Q: What is a computation fingerprint?

A computation fingerprint is a cryptographic digest that uniquely identifies a cached result by binding it to the exact inputs, computation function, parameters, engine version, and hardware class that produced it. The formula is SHA3-256(input_hash || computation_hash || parameter_hash || version || hardware_class). Two identical outputs from different computations produce different fingerprints and are stored as separate cache entries. This is fundamentally different from memoization, which only considers the function and its arguments.

A cached result is not just a value. It is a provable fact.
The fingerprint binds it to the exact inputs, function, parameters, and engine
that produced it. Same output, different computation? Different cache entry.

5 Fields in Every Fingerprint
32B Fingerprint Size (SHA3-256)
~40ns Fingerprint Compute Time
3 PQ Signatures on Result

Definition

A computation fingerprint is a cryptographic identity for a cached result. It is SHA3-256(input_hash || computation_hash || parameter_hash || version || hardware_class). It proves not just what the result is, but how it was produced. Two identical outputs from different computations produce different fingerprints and occupy different cache entries. This is the foundation that makes cached results verifiable -- not just reusable.

The Key Insight

Same Output. Different Computation. Different Fingerprint.

This is the distinction that separates computation fingerprinting from every other caching strategy. The output alone does not determine the cache entry. The provenance does.

Step	Computation A: STARK Verification	Computation B: Simulation
Input	47KB STARK proof	test fixture data
Function	FRI + constraint evaluation	mock verifier (always true)
Parameters	Goldilocks field, 64-bit	none (simulation)
Output	"verified: true"	"verified: true"
Fingerprint	7a3f...c821	e91b...4f07

Both computations produce the same output: "verified: true." But Computation A is a cryptographic proof verified through the FRI protocol (see ZK proof caching). Computation B is a mock that always returns true. If you hash the output, they share a cache entry. With computation fingerprinting, they are distinct entries with distinct cryptographic provenance. An auditor can tell which is which.
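The contrast can be sketched in a few lines of Python. This is an illustration, not Cachee's implementation: the helper provenance_key, the placeholder byte strings, and the length-prefixed framing are all assumptions made for the example, and the fields are hashed directly rather than as per-field digests to keep it short.

```python
import hashlib

def sha3_hex(data: bytes) -> str:
    """SHA3-256 of raw bytes, as a hex string."""
    return hashlib.sha3_256(data).hexdigest()

def provenance_key(*fields: bytes) -> str:
    """Hypothetical helper: hash the provenance fields instead of the output.
    Each field is length-prefixed so field boundaries stay unambiguous."""
    framed = b"".join(len(f).to_bytes(4, "big") + f for f in fields)
    return sha3_hex(framed)

# Both computations return the same bytes.
output_a = b"verified: true"
output_b = b"verified: true"

# Keying by output: the two results collapse onto one cache key.
output_key_a = sha3_hex(output_a)
output_key_b = sha3_hex(output_b)

# Keying by provenance: input, function, parameters, version, hardware class.
fp_a = provenance_key(b"<stark proof bytes>", b"fri+constraint-evaluation",
                      b"goldilocks-64", b"engine/1.2.3", b"Deterministic")
fp_b = provenance_key(b"<test fixture data>", b"mock-verifier",
                      b"", b"engine/1.2.3", b"Deterministic")

print("same output key:", output_key_a == output_key_b)   # True
print("same fingerprint:", fp_a == fp_b)                   # False
```

The output keys match because the outputs match. The provenance keys differ because nothing about how the two results were produced is the same.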

Anatomy

Five Fields. Each One Prevents an Attack.

input_hash

SHA3-256 of the input data

The raw input to the computation -- ciphertext bytes, proof bytes, query parameters, template data. Different inputs always produce different fingerprints, even if the output happens to be the same.

computation_hash

SHA3-256 of the function / circuit / program

What function was executed? A STARK verifier, a BFV inner product, a PLONK circuit? The computation hash identifies the exact code path. Different functions on the same input produce different fingerprints.

parameter_hash

SHA3-256 of configuration parameters

FHE parameters (N, Q, t), STARK configuration (field, blowup factor, num queries), key sizes, security levels. The same function with different parameters produces different fingerprints.

version

Engine name + semver + circuit ID

Which engine version produced this result? Engine upgrades that fix bugs or change behavior produce new fingerprints. Old results from old versions are never served after an upgrade.

hardware_class

Deterministic / NearDeterministic / NonDeterministic

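Is the computation bit-reproducible on any hardware? Deterministic results are bit-for-bit identical everywhere. NearDeterministic results (CKKS approximate arithmetic, floating-point inference) are reproducible within a bounded error. NonDeterministic results (random sampling) are cached but flagged accordingly.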

content_address

The final cache key

The content address is SHA3-256(primitive || content_hash || fingerprint.digest()). This is the actual key used to store and retrieve the cached result. It incorporates the primitive type and a hash of the cached content alongside the fingerprint digest.
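A sketch of how the component hashes might be produced. The page does not say how parameters are serialized before hashing, so the canonical-JSON step below is an assumption, as are the function name parameter_hash, the dictionary keys, and the placeholder input and circuit bytes. The point it illustrates: the same parameters must always serialize to the same bytes, or identical computations end up with different fingerprints.

```python
import hashlib
import json

def sha3_hex(data: bytes) -> str:
    """SHA3-256 of raw bytes, as a hex string."""
    return hashlib.sha3_256(data).hexdigest()

def parameter_hash(params: dict) -> str:
    """Hash parameters in a canonical form (sorted keys, no extra whitespace)
    so that dictionary ordering never changes the digest."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return sha3_hex(canonical.encode("utf-8"))

# Same parameters, different insertion order: same hash.
p1 = {"N": 4096, "Q_bits": 56, "t": 65537}
p2 = {"t": 65537, "N": 4096, "Q_bits": 56}
assert parameter_hash(p1) == parameter_hash(p2)

# One parameter changed: different hash.
p3 = {"N": 8192, "Q_bits": 56, "t": 65537}
assert parameter_hash(p1) != parameter_hash(p3)

# input_hash and computation_hash are plain digests of the relevant bytes.
input_hash = sha3_hex(b"<ciphertext bytes>")
computation_hash = sha3_hex(b"<bfv_inner_product circuit description>")
```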

The Formula

Content Address Construction

The computation fingerprint feeds into the content address -- the final key under which the result is stored and retrieved. The content address includes three components:

// Step 1: Build the computation fingerprint
fingerprint = SHA3-256(
    input_hash           // SHA3-256 of input data
    || computation_hash  // SHA3-256 of function/circuit
    || parameter_hash    // SHA3-256 of parameters
    || version           // engine name + semver + circuit ID
    || hardware_class    // Deterministic | NearDeterministic | NonDeterministic
)

// Step 2: Build the content address (the cache key)
content_address = SHA3-256(
    primitive            // "stark_verify" | "bfv_inner_product" | "ckks_inference" | ...
    || content_hash      // SHA3-256 of the cached result itself
    || fingerprint       // the 32-byte digest from Step 1
)
    

The content address is 32 bytes. It is deterministic -- the same computation with the same inputs, parameters, and version, yielding the same result bytes, always produces the same content address. It is collision-resistant -- SHA3-256 has 128 bits of collision resistance. And it is fast -- computing the fingerprint and content address takes roughly 40 nanoseconds, negligible compared to the computation it represents.
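The two steps translate to Python roughly as follows. Treat this as a sketch under stated assumptions: the Fingerprint dataclass, the frame helper, and the string encodings are invented for illustration. The formula above shows plain concatenation; this version length-prefixes each field, since version and primitive are variable-length and unframed concatenation would make field boundaries ambiguous. Cachee's actual byte layout is not documented here.

```python
import hashlib
from dataclasses import dataclass

def sha3(data: bytes) -> bytes:
    """SHA3-256 digest (32 bytes)."""
    return hashlib.sha3_256(data).digest()

def frame(*parts: bytes) -> bytes:
    """Length-prefix each part (assumed framing, not from the source)."""
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)

@dataclass(frozen=True)
class Fingerprint:
    input_hash: bytes
    computation_hash: bytes
    parameter_hash: bytes
    version: str
    hardware_class: str

    def digest(self) -> bytes:
        # Step 1: hash the five provenance fields together.
        return sha3(frame(self.input_hash, self.computation_hash,
                          self.parameter_hash, self.version.encode(),
                          self.hardware_class.encode()))

def content_address(primitive: str, result: bytes, fp: Fingerprint) -> bytes:
    # Step 2: bind primitive type, result bytes, and fingerprint into one key.
    content_hash = sha3(result)
    return sha3(frame(primitive.encode(), content_hash, fp.digest()))

fp = Fingerprint(
    input_hash=sha3(b"<input bytes>"),
    computation_hash=sha3(b"bfv_inner_product"),
    parameter_hash=sha3(b"N=4096,Q=56bit,t=65537"),
    version="cachee-engine/1.2.3",
    hardware_class="Deterministic",
)
addr = content_address("bfv_inner_product", b"<result bytes>", fp)
print(len(fp.digest()), len(addr))  # 32 32
```

Both values are 32 bytes, and recomputing with identical fields and result bytes returns the identical address.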

The Distinction

This Is Not Memoization

Memoization is a convenience. Computation fingerprinting is an audit trail.

Memoization

f(x) = y

Function + arguments = cache key

Memoization caches the result of a pure function given its arguments. It has no concept of versioning, parameter binding, hardware class, or cryptographic attestation. If the engine has a bug and you deploy a fix, memoization serves the old buggy result. If two different functions produce the same output, memoization cannot distinguish them. There is no provenance, no signature, no verifiability.

Computation Fingerprinting

f(x) = y, proved

Inputs + function + params + version + HW class = cache key

Computation fingerprinting caches "f(x) = y, computed by engine v1.2.3 with parameters P on deterministic hardware, signed by ML-DSA-65 + FALCON-512 + SLH-DSA." The cached result is a provable fact, not a convenience. Upgrade the engine? New fingerprint, new cache entry. Change parameters? New fingerprint. The result carries its own proof of provenance.

Property	Memoization	Computation Fingerprinting
Cache key includes version	No	Yes
Cache key includes parameters	No	Yes
Cache key includes hardware class	No	Yes
Result is cryptographically signed	No	Yes (3 PQ families)
Independently verifiable	No	Yes (cachee-verify)
Distinguishes identical outputs	No	Yes (provenance-based)
Survives engine upgrades correctly	No (serves stale)	Yes (new fingerprint)
Audit trail	None	Full chain of custody
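The "survives engine upgrades" row is easy to reproduce with a toy example. Everything below is hypothetical: a made-up engine function with a bug in v1.2.2, a memo cache keyed by function and argument, and a second cache keyed by a hash that also covers the version. Signing and the other fingerprint fields are left out to isolate the versioning effect.

```python
import hashlib

def engine(version: str, x: int) -> int:
    """Toy engine that doubles x. Version 1.2.2 has an off-by-one bug."""
    return x * 2 + (1 if version == "1.2.2" else 0)

memo_cache: dict = {}
fp_cache: dict = {}

def memo_get(version: str, x: int) -> int:
    key = ("double", x)  # function + arguments only; version is ignored
    if key not in memo_cache:
        memo_cache[key] = engine(version, x)
    return memo_cache[key]

def fp_get(version: str, x: int) -> int:
    # The key also covers the engine version, so an upgrade misses the cache.
    material = f"double|{x}|toy-engine/{version}".encode()
    key = hashlib.sha3_256(material).hexdigest()
    if key not in fp_cache:
        fp_cache[key] = engine(version, x)
    return fp_cache[key]

# Before the upgrade, both caches store the buggy result.
assert memo_get("1.2.2", 10) == 21
assert fp_get("1.2.2", 10) == 21

# After the upgrade, memoization still serves the stale value.
assert memo_get("1.2.3", 10) == 21
# The version-aware key forces a recomputation with the fixed engine.
assert fp_get("1.2.3", 10) == 20
```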

Security Analysis

Why Every Field Matters

Remove any field from the fingerprint, or the content hash from the content address, and the cache becomes unsound. Each one prevents a specific class of attack.

Without input_hash

Input confusion attack

Different inputs could share a cache entry. Patient A's encrypted diagnosis result could be returned for Patient B's query, because the function, parameters, and version are the same. The input is the only distinguishing factor.

Without computation_hash

Function substitution attack

Different functions could be confused. A "sum" aggregation result could be returned for an "average" query on the same encrypted dataset. Both operate on the same inputs with the same parameters, but compute fundamentally different things.

Without parameter_hash

Parameter downgrade attack

Results from weaker security parameters could be served for queries expecting stronger security. A BFV-128 result (lower security) could be returned for a BFV-256 query (higher security). The client believes they have 256-bit security when they do not.

Without version

Stale result attack

An old buggy engine's results could be served after an upgrade. If engine v1.2.2 had a rounding error that produced incorrect BFV decryptions, those incorrect results would persist in cache and be served to v1.2.3 clients indefinitely.

Without hardware_class

Reproducibility confusion

A non-deterministic result (from randomized sampling) could be treated as deterministic. Or a NearDeterministic CKKS result (with bounded floating-point error) could be treated as exact. The consumer cannot assess result reliability.

Without content_hash

Result tampering attack

If the content address does not include a hash of the result itself, a cache poisoning attack could replace the stored value with a forged one while keeping the same fingerprint. The content hash binds the address to the actual bytes stored.
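The fingerprint-field cases above can be checked mechanically. The sketch below is illustrative only: the field values are readable placeholder strings rather than real digests, and the omit argument exists purely to simulate a fingerprint with one field missing. For each field, it takes two records that differ only in that field and shows that they are distinguishable with the full fingerprint and indistinguishable once the field is dropped.

```python
import hashlib

FIELDS = ("input_hash", "computation_hash", "parameter_hash",
          "version", "hardware_class")

def fingerprint(record: dict, omit: str = "") -> str:
    """Hash every field except `omit`, length-prefixing each value."""
    h = hashlib.sha3_256()
    for name in FIELDS:
        if name == omit:
            continue
        value = record[name].encode()
        h.update(len(value).to_bytes(4, "big") + value)
    return h.hexdigest()

# Placeholder values; real fields would be digests and version strings.
base = {
    "input_hash": "patient-A-ciphertext",
    "computation_hash": "sum",
    "parameter_hash": "BFV-256",
    "version": "engine/1.2.3",
    "hardware_class": "Deterministic",
}
variants = {
    "input_hash": "patient-B-ciphertext",
    "computation_hash": "average",
    "parameter_hash": "BFV-128",
    "version": "engine/1.2.2",
    "hardware_class": "NonDeterministic",
}

for field in FIELDS:
    other = {**base, field: variants[field]}
    # Full fingerprint: the two records get different keys.
    assert fingerprint(base) != fingerprint(other)
    # Field omitted: the two records collapse onto one key.
    assert fingerprint(base, omit=field) == fingerprint(other, omit=field)
```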

Integration

How It Works in Cachee

Computation fingerprinting is built into Cachee's core operations. You do not compute fingerprints manually. The Cachee SDK and CLI handle fingerprint creation, storage, and verification.

SET with Fingerprint

When you SET a value with a computation fingerprint, Cachee stores the result, signs it with three PQ families, and generates the H33-74 receipt. The fingerprint is embedded in the receipt and used to derive the content address.

GETVERIFIED

GETVERIFIED retrieves the cached result and returns the computation fingerprint alongside it. The consumer can see exactly which computation produced this result -- the inputs, function, parameters, version, and hardware class.

cachee-verify

The cachee-verify CLI tool checks the fingerprint against the H33-74 receipt and the three PQ signatures. It verifies that the stored result matches the fingerprint, that the signatures are valid, and that no tampering has occurred. No network call. No Cachee account. No trust in any third party.
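The hash-matching part of that check can be sketched offline. This is not the cachee-verify implementation, and it does not touch the H33-74 receipt or the three PQ signatures, which would need dedicated libraries. It only shows the idea of recomputing the content address from the stored bytes and comparing it with the address the entry was stored under. The function names and the length-prefixed framing are assumptions.

```python
import hashlib
import hmac

def sha3(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()

def frame(*parts: bytes) -> bytes:
    """Length-prefix each part (assumed framing, not from the source)."""
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)

def content_address(primitive: bytes, result: bytes, fp_digest: bytes) -> bytes:
    return sha3(frame(primitive, sha3(result), fp_digest))

def matches_address(address: bytes, primitive: bytes, result: bytes,
                    fp_digest: bytes) -> bool:
    """True if the stored bytes still hash to the address they live under."""
    recomputed = content_address(primitive, result, fp_digest)
    return hmac.compare_digest(recomputed, address)

fp_digest = sha3(b"<placeholder fingerprint fields>")
address = content_address(b"stark_verify", b"verified:true", fp_digest)

# Untouched entry: the recomputed address matches.
assert matches_address(address, b"stark_verify", b"verified:true", fp_digest)
# Tampered result bytes: the recomputed address no longer matches.
assert not matches_address(address, b"stark_verify", b"verified:false", fp_digest)
```

A hash comparison alone detects substitution of the stored bytes. Establishing who produced the entry is the job of the signatures, which this sketch leaves out.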

Live Demo

Fingerprint Creation and Verification

cachee-fingerprint-demo

$ cachee fp create \
    --input-hash a7c3f9...2b41 \
    --computation "bfv_inner_product" \
    --params "N=4096,Q=56bit,t=65537" \
    --version "cachee-engine/1.2.3" \
    --hw-class "Deterministic"
Fingerprint: e4a1b7c3...9f2d (32 bytes)
Content address: 3c7f...a812

$ cachee-verify 3c7f...a812
ML-DSA-65: VALID  FALCON-512: VALID  SLH-DSA: VALID
Fingerprint verified. Result is authentic and untampered.

Run it yourself: brew install cachee && cachee fp create --help

Applications

Where Fingerprinting Matters

📋

Audit Trails

Regulators ask: "How was this result produced?" The fingerprint answers with cryptographic specificity -- inputs, function, parameters, version, hardware.

🛡

Multi-Engine Systems

Multiple compute engines produce results. Fingerprinting ensures each engine's results are isolated. Engine A's cache never contaminates Engine B's results.

🔄

Rolling Upgrades

During a deployment, old and new engine versions run simultaneously. Fingerprinting ensures v1.2.2 results are never served to v1.2.3 clients, even from the same cache.

💰

Financial Compliance

SOX, SOC 2, and MiFID II audits ask for evidence of how computed results were produced. The fingerprint provides a cryptographic chain of custody that auditors can check.

🧠

ML Model Versioning

Model v2 produces different predictions than v1. Fingerprinting isolates cached predictions by model version, preventing stale predictions from being served.

👤

Privacy-Preserving Systems

FHE and MPC computations must be reproducible. The fingerprint proves the cached result came from a genuine encrypted computation, not a simulation.

Detail

Hardware Class: Why It Exists

Not all computations are bit-reproducible. The hardware class field classifies each cached result by its reproducibility guarantee, so consumers know exactly what they are getting.

Hardware Class	Meaning	Examples	Cache Behavior
Deterministic	Bit-for-bit identical on any hardware	BFV FHE, STARK verification, SHA3 hashing	Full caching -- result is exact
NearDeterministic	Bounded error, reproducible within precision	CKKS FHE, floating-point ML inference	Cached with precision metadata
NonDeterministic	Result depends on randomness or hardware timing	Monte Carlo sampling, random encryption nonces	Cached but flagged -- consumer decides

The hardware class is set by the computation engine, not by the user. When a BFV inner product is computed, the engine automatically sets hardware_class to Deterministic because BFV integer arithmetic is exact. When a CKKS inference is computed, it is set to NearDeterministic with a precision bound attached. This metadata travels with the cached result and is exposed to consumers via GETVERIFIED.
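One way to picture the engine-side classification is an enum plus a lookup table. The sketch below is hypothetical: the HardwareClass enum, the PRIMITIVE_CLASS table, the cache_metadata function, and the "monte_carlo" primitive name are illustrative and not part of any documented Cachee API. It encodes the table above: deterministic results carry no extra metadata, near-deterministic results must carry a precision bound, and non-deterministic results are flagged.

```python
from enum import Enum

class HardwareClass(Enum):
    DETERMINISTIC = "Deterministic"
    NEAR_DETERMINISTIC = "NearDeterministic"
    NON_DETERMINISTIC = "NonDeterministic"

# Assumed mapping, chosen by the engine rather than the caller.
PRIMITIVE_CLASS = {
    "stark_verify": HardwareClass.DETERMINISTIC,
    "bfv_inner_product": HardwareClass.DETERMINISTIC,
    "ckks_inference": HardwareClass.NEAR_DETERMINISTIC,
    "monte_carlo": HardwareClass.NON_DETERMINISTIC,
}

def cache_metadata(primitive: str, precision_bound=None) -> dict:
    """Build the reproducibility metadata stored next to a cached result."""
    hw = PRIMITIVE_CLASS[primitive]
    meta = {
        "hardware_class": hw.value,
        "flagged": hw is HardwareClass.NON_DETERMINISTIC,
    }
    if hw is HardwareClass.NEAR_DETERMINISTIC:
        if precision_bound is None:
            raise ValueError("NearDeterministic results need a precision bound")
        meta["precision_bound"] = precision_bound
    return meta

print(cache_metadata("bfv_inner_product"))
print(cache_metadata("ckks_inference", precision_bound=1e-6))
print(cache_metadata("monte_carlo"))
```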

Install

Get Started

brew tap h33ai-postquantum/tap && brew install cachee
cachee init && cachee start

# Store a result with computation fingerprint
SET stark:proof_abc123 "verified:true" FP <fingerprint_hex>

# Retrieve result + fingerprint
GETVERIFIED stark:proof_abc123

# Verify independently (no network, no account)
cachee-verify stark:proof_abc123

Fingerprinting is automatic when using the Cachee SDK. The SDK computes the fingerprint from the computation context, signs the result, and generates the H33-74 receipt -- all in a single API call.

Every cached result should carry proof of how it was produced.
