Computation Fingerprinting
A cached result is not just a value. It is a provable fact. The fingerprint binds it to the exact inputs, function, parameters, and engine that produced it. Same output, different computation? Different cache entry.
A computation fingerprint is a cryptographic identity for a cached result. It is SHA3-256(input_hash || computation_hash || parameter_hash || version || hardware_class). It proves not just what the result is, but how it was produced. Two identical outputs from different computations produce different fingerprints and occupy different cache entries. This is the foundation that makes cached results verifiable -- not just reusable.
Same Output. Different Computation. Different Fingerprint.
This is the distinction that separates computation fingerprinting from caches keyed on the output or the arguments alone. The output alone does not determine the cache entry. The provenance does.
Consider two computations that produce the same output: "verified: true." Computation A is a cryptographic proof verified through the FRI protocol. Computation B is a mock that always returns true. If the cache key is derived from the output alone, they share a cache entry. With computation fingerprinting, they are distinct entries with distinct cryptographic provenance. An auditor can tell which is which.
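The difference can be reproduced in a few lines. This is a minimal sketch, not Cachee's implementation: the byte encoding of each field (hashing every field to 32 bytes before concatenating) is an assumption, and the function labels, parameter strings, and version string are hypothetical values chosen for illustration.

```python
import hashlib


def sha3(data: bytes) -> bytes:
    """SHA3-256 digest of the given bytes."""
    return hashlib.sha3_256(data).digest()


def fingerprint(input_data: bytes, computation: bytes, parameters: bytes,
                version: str, hardware_class: str) -> bytes:
    # Assumed encoding: every field is hashed to a fixed 32 bytes first,
    # so the concatenation cannot be ambiguous.
    parts = [
        sha3(input_data),
        sha3(computation),
        sha3(parameters),
        sha3(version.encode("utf-8")),
        sha3(hardware_class.encode("utf-8")),
    ]
    return sha3(b"".join(parts))


output = b"verified: true"
proof = b"example-proof-bytes"  # hypothetical input

# Computation A: a real verifier. Computation B: a mock that always returns true.
fp_a = fingerprint(proof, b"stark_verifier_fri", b"blowup=8;queries=32",
                   "engine 1.2.3", "Deterministic")
fp_b = fingerprint(proof, b"mock_always_true", b"",
                   "engine 1.2.3", "Deterministic")

# Keying on the output alone cannot tell A from B.
output_key_a = sha3(output)
output_key_b = sha3(output)

print(output_key_a == output_key_b)  # True
print(fp_a == fp_b)                  # False
```

The output hash is identical for both, while the fingerprints differ because the computation and parameter fields differ.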
Five Fields. Each One Prevents an Attack.
input_hash: The raw input to the computation -- ciphertext bytes, proof bytes, query parameters, template data. Different inputs always produce different fingerprints, even if the output happens to be the same.
computation_hash: What function was executed? A STARK verifier, a BFV inner product, a PLONK circuit? The computation hash identifies the exact code path. Different functions on the same input produce different fingerprints.
parameter_hash: FHE parameters (N, Q, t), STARK configuration (field, blowup factor, num queries), key sizes, security levels. The same function with different parameters produces different fingerprints.
version: Which engine version produced this result? Engine upgrades that fix bugs or change behavior produce new fingerprints. Old results from old versions are never served after an upgrade.
hardware_class: Is the computation bit-reproducible on any hardware? Deterministic results are bit-for-bit identical everywhere. NearDeterministic results (CKKS floating-point) have bounded error. NonDeterministic results (random sampling) are cached but flagged accordingly.
The content address is SHA3-256(primitive || content_hash || fingerprint.digest()). This is the actual key used to store and retrieve the cached result. It incorporates the primitive type and the content itself alongside the fingerprint.
Content Address Construction
The computation fingerprint feeds into the content address -- the final key under which the result is stored and retrieved. The content address includes three components:
// Step 1: Build the computation fingerprint
fingerprint = SHA3-256(
    input_hash            // SHA3-256 of input data
    || computation_hash   // SHA3-256 of function/circuit
    || parameter_hash     // SHA3-256 of parameters
    || version            // engine name + semver + circuit ID
    || hardware_class     // Deterministic | NearDeterministic | NonDeterministic
)

// Step 2: Build the content address (the cache key)
content_address = SHA3-256(
    primitive             // "stark_verify" | "bfv_inner_product" | "ckks_inference" | ...
    || content_hash       // SHA3-256 of the cached result itself
    || fingerprint        // the 32-byte digest from Step 1
)
The content address is 32 bytes. It is deterministic -- the same computation with the same inputs, parameters, and version always produces the same content address. It is collision-resistant -- SHA3-256 has 128 bits of collision resistance. And it is fast -- computing the fingerprint and content address takes roughly 40 nanoseconds, negligible compared to the computation it represents.
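Step 2 can be sketched with Python's standard library. The length prefix on the primitive label is an assumption made here so the variable-length string cannot run into the fixed-size fields that follow; the source does not specify Cachee's byte encoding, and the fingerprint below is a stand-in digest rather than a real Step 1 result.

```python
import hashlib


def sha3(data: bytes) -> bytes:
    """SHA3-256 digest of the given bytes."""
    return hashlib.sha3_256(data).digest()


def content_address(primitive: str, result: bytes, fingerprint_digest: bytes) -> bytes:
    if len(fingerprint_digest) != 32:
        raise ValueError("fingerprint digest must be 32 bytes")
    label = primitive.encode("utf-8")
    # Assumed encoding: 2-byte big-endian length prefix on the primitive label.
    prefix = len(label).to_bytes(2, "big") + label
    # primitive || content_hash || fingerprint
    return sha3(prefix + sha3(result) + fingerprint_digest)


fp = sha3(b"stand-in for a Step 1 fingerprint")

addr1 = content_address("stark_verify", b"verified:true", fp)
addr2 = content_address("stark_verify", b"verified:true", fp)
addr3 = content_address("bfv_inner_product", b"verified:true", fp)

print(len(addr1))      # 32
print(addr1 == addr2)  # True: same inputs, same address
print(addr1 == addr3)  # False: different primitive, different address
```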
This Is Not Memoization
Memoization is a convenience. Computation fingerprinting is an audit trail.
Memoization
Memoization caches the result of a pure function given its arguments. It has no concept of versioning, parameter binding, hardware class, or cryptographic attestation. If the engine has a bug and you deploy a fix, memoization serves the old buggy result. If two different functions produce the same output, memoization cannot distinguish them. There is no provenance, no signature, no verifiability.
Computation Fingerprinting
Computation fingerprinting caches "f(x) = y, computed by engine v1.2.3 with parameters P on deterministic hardware, signed by ML-DSA-65 + FALCON-512 + SLH-DSA." The cached result is a provable fact, not a convenience. Upgrade the engine? New fingerprint, new cache entry. Change parameters? New fingerprint. The result carries its own proof of provenance.
| Property | Memoization | Computation Fingerprinting |
|---|---|---|
| Cache key includes version | No | Yes |
| Cache key includes parameters | No | Yes |
| Cache key includes hardware class | No | Yes |
| Result is cryptographically signed | No | Yes (3 PQ families) |
| Independently verifiable | No | Yes (cachee-verify) |
| Distinguishes identical outputs | No | Yes (provenance-based) |
| Survives engine upgrades correctly | No (serves stale) | Yes (new fingerprint) |
| Audit trail | None | Full chain of custody |
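
The "survives engine upgrades" row can be shown with a toy example. This sketch covers only the version field, not signing or receipts, and the two functions and version strings are hypothetical.

```python
import hashlib


def buggy_double(x: int) -> int:
    return 2 * x + 1  # hypothetical bug shipped in "1.2.2"


def fixed_double(x: int) -> int:
    return 2 * x  # hypothetical fix shipped in "1.2.3"


# Plain memoization: the key is the argument only.
memo = {}


def memo_call(fn, x):
    if x not in memo:
        memo[x] = fn(x)
    return memo[x]


# Version-bound cache: the key covers the argument and the engine version.
versioned = {}


def versioned_key(x: int, version: str) -> bytes:
    h = hashlib.sha3_256()
    h.update(x.to_bytes(8, "big", signed=True))  # fixed-width argument encoding
    h.update(version.encode("utf-8"))
    return h.digest()


def versioned_call(fn, x, version):
    key = versioned_key(x, version)
    if key not in versioned:
        versioned[key] = fn(x)
    return versioned[key]


stale = (memo_call(buggy_double, 21), memo_call(fixed_double, 21))
fresh = (versioned_call(buggy_double, 21, "1.2.2"),
         versioned_call(fixed_double, 21, "1.2.3"))

print(stale)  # (43, 43): the fix is deployed, the buggy result is still served
print(fresh)  # (43, 42): the new version gets a new cache entry
```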
Why Every Field Matters
Remove any field from the fingerprint, or the content hash from the content address, and the cache becomes unsound. Each one prevents a specific class of attack.
Without input_hash
Different inputs could share a cache entry. Patient A's encrypted diagnosis result could be returned for Patient B's query, because the function, parameters, and version are the same. The input is the only distinguishing factor.
Without computation_hash
Different functions could be confused. A "sum" aggregation result could be returned for an "average" query on the same encrypted dataset. Both operate on the same inputs with the same parameters, but compute fundamentally different things.
Without parameter_hash
Results from weaker security parameters could be served for queries expecting stronger security. A BFV-128 result (lower security) could be returned for a BFV-256 query (higher security). The client believes they have 256-bit security when they do not.
Without version
An old buggy engine's results could be served after an upgrade. If engine v1.2.2 had a rounding error that produced incorrect BFV decryptions, those incorrect results would persist in cache and be served to v1.2.3 clients indefinitely.
Without hardware_class
A non-deterministic result (from randomized sampling) could be treated as deterministic. Or a NearDeterministic CKKS result (with bounded floating-point error) could be treated as exact. The consumer cannot assess result reliability.
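The five failure modes above can be reproduced by dropping one field at a time from a toy fingerprint. This is a sketch under stated assumptions: each field value is hashed to a fixed width before concatenation, and the sample values are illustrative, loosely following the scenarios described above.

```python
import hashlib

FIELDS = ("input_hash", "computation_hash", "parameter_hash",
          "version", "hardware_class")


def sha3(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def fingerprint(ctx: dict, omit: str = "") -> bytes:
    # Hash each field value to 32 bytes (assumed encoding), skipping `omit`.
    return sha3(b"".join(sha3(ctx[name]) for name in FIELDS if name != omit))


# Raw illustrative values; each one is hashed inside fingerprint().
base = {
    "input_hash": b"patient-A-ciphertext",
    "computation_hash": b"sum",
    "parameter_hash": b"BFV-128",
    "version": b"1.2.2",
    "hardware_class": b"Deterministic",
}
variants = {
    "input_hash": b"patient-B-ciphertext",
    "computation_hash": b"average",
    "parameter_hash": b"BFV-256",
    "version": b"1.2.3",
    "hardware_class": b"NonDeterministic",
}

results = {}
for name in FIELDS:
    other = dict(base, **{name: variants[name]})  # change exactly one field
    full_differs = fingerprint(base) != fingerprint(other)
    ablated_collides = fingerprint(base, omit=name) == fingerprint(other, omit=name)
    results[name] = (full_differs, ablated_collides)
    print(name, full_differs, ablated_collides)
```

For every field, the full fingerprint separates the two computations, and the fingerprint with that field omitted maps both to the same entry.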
Without content_hash
If the content address does not include a hash of the result itself, a cache poisoning attack could replace the stored value with a forged one while keeping the same fingerprint. The content hash binds the address to the actual bytes stored.
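A consumer can detect this kind of substitution by recomputing the address from the bytes it was handed. The sketch below assumes a length-prefixed primitive label and uses a stand-in fingerprint digest; it covers only the hash check, not the signature verification that cachee-verify also performs.

```python
import hashlib
import hmac


def sha3(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def content_address(primitive: str, stored: bytes, fingerprint_digest: bytes) -> bytes:
    label = primitive.encode("utf-8")
    # Assumed encoding: 2-byte length prefix on the primitive label.
    return sha3(len(label).to_bytes(2, "big") + label
                + sha3(stored) + fingerprint_digest)


def address_matches(address: bytes, primitive: str, stored: bytes,
                    fingerprint_digest: bytes) -> bool:
    expected = content_address(primitive, stored, fingerprint_digest)
    return hmac.compare_digest(address, expected)  # constant-time comparison


fp = sha3(b"stand-in fingerprint")
address = content_address("stark_verify", b"verified:true", fp)

genuine_ok = address_matches(address, "stark_verify", b"verified:true", fp)
forged_ok = address_matches(address, "stark_verify", b"verified:false", fp)

print(genuine_ok)  # True
print(forged_ok)   # False: the forged bytes no longer hash to the address
```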
How It Works in Cachee
Computation fingerprinting is built into Cachee's core operations. You do not compute fingerprints manually. The Cachee SDK and CLI handle fingerprint creation, storage, and verification.
SET with Fingerprint
When you SET a value with a computation fingerprint, Cachee stores the result, signs it with three PQ families, and generates the H33-74 receipt. The fingerprint is embedded in the receipt and feeds into the content address under which the result is stored.
GETVERIFIED
GETVERIFIED retrieves the cached result and returns the computation fingerprint alongside it. The consumer can see exactly which computation produced this result -- the inputs, function, parameters, version, and hardware class.
cachee-verify
The cachee-verify CLI tool checks the fingerprint against the H33-74 receipt and the three PQ signatures. It verifies that the stored result matches the fingerprint, that the signatures are valid, and that no tampering has occurred. No network call. No Cachee account. No trust in any third party.
Fingerprint Creation and Verification
Run it yourself: brew tap h33ai-postquantum/tap && brew install cachee && cachee fp create --help
Hardware Class: Why It Exists
Not all computations are bit-reproducible. The hardware class field classifies each cached result by its reproducibility guarantee, so consumers know exactly what they are getting.
| Hardware Class | Meaning | Examples | Cache Behavior |
|---|---|---|---|
| Deterministic | Bit-for-bit identical on any hardware | BFV FHE, STARK verification, SHA3 hashing | Full caching -- result is exact |
| NearDeterministic | Bounded error, reproducible within precision | CKKS FHE, floating-point ML inference | Cached with precision metadata |
| NonDeterministic | Result depends on randomness or hardware timing | Monte Carlo sampling, random encryption nonces | Cached but flagged -- consumer decides |
The hardware class is set by the computation engine, not by the user. When a BFV inner product is computed, the engine automatically sets hardware_class to Deterministic because BFV integer arithmetic is exact. When a CKKS inference is computed, it is set to NearDeterministic with a precision bound attached. This metadata travels with the cached result and is exposed to consumers via GETVERIFIED.
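On the consumer side, the hardware class can drive how a cached value is compared against an independent recomputation. The enum values below mirror the three classes in the table; the `acceptable` function and its `precision` argument are a hypothetical policy, not part of the Cachee SDK.

```python
import math
from enum import Enum


class HardwareClass(Enum):
    DETERMINISTIC = "Deterministic"
    NEAR_DETERMINISTIC = "NearDeterministic"
    NON_DETERMINISTIC = "NonDeterministic"


def acceptable(cached: float, recomputed: float, hw: HardwareClass,
               precision: float = 0.0) -> bool:
    """Hypothetical policy: does a recomputation agree with the cached value?"""
    if hw is HardwareClass.DETERMINISTIC:
        return cached == recomputed  # must match exactly
    if hw is HardwareClass.NEAR_DETERMINISTIC:
        # Must agree within the precision bound attached to the result.
        return math.isclose(cached, recomputed, rel_tol=0.0, abs_tol=precision)
    # NonDeterministic: no reproducibility guarantee, so this policy never
    # treats a recomputation as confirmation. The caller decides what to do.
    return False


exact = acceptable(0.5, 0.5, HardwareClass.DETERMINISTIC)
drifted = acceptable(0.5, 0.5 + 1e-7, HardwareClass.DETERMINISTIC)
bounded = acceptable(0.5, 0.5 + 1e-7, HardwareClass.NEAR_DETERMINISTIC, precision=1e-6)
sampled = acceptable(0.5, 0.5, HardwareClass.NON_DETERMINISTIC)

print(exact, drifted, bounded, sampled)  # True False True False
```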
Get Started
brew tap h33ai-postquantum/tap && brew install cachee
cachee init && cachee start
# Store a result with computation fingerprint
SET stark:proof_abc123 "verified:true" FP <fingerprint_hex>
# Retrieve result + fingerprint
GETVERIFIED stark:proof_abc123
# Verify independently (no network, no account)
cachee-verify stark:proof_abc123
Fingerprinting is automatic when using the Cachee SDK. The SDK computes the fingerprint from the computation context, signs the result, and generates the H33-74 receipt -- all in a single API call.