FHE Caching
Fully homomorphic encryption is 10,000x slower than plaintext.
Cache the decrypted result. Serve it at 31 nanoseconds.
The layer that eliminates redundant homomorphic computation.
FHE caching stores the results of fully homomorphic encryption computations. FHE operations are 1,000 to 10,000x slower than their plaintext equivalents. A single encrypted inner product on BFV-128 takes roughly 1 millisecond. The decrypted result of that computation, once cached, can be retrieved in 31 nanoseconds. A cached result eliminates the recomputation entirely -- no key generation, no encryption, no homomorphic evaluation, no decryption. The same answer, from cache, in nanoseconds.
100 milliseconds vs 31 nanoseconds
The same computation result. One path re-encrypts, re-evaluates, and re-decrypts. The other remembers.
What Gets Cached: The Output, Not the Ciphertext
This is the most important architectural decision in FHE caching. You do not cache the ciphertext. You cache the decrypted result.
Ciphertexts are enormous. A single BFV-128 ciphertext with N=4096 and one 56-bit modulus is 65,536 bytes. BFV-256 with N=32768 and multiple moduli can exceed 2 megabytes per ciphertext. Caching ciphertexts would consume gigabytes of memory for even modest workloads -- and you would still need to decrypt them on every access.
The decrypted result -- the actual answer -- is typically a few bytes to a few kilobytes. A biometric match verdict is 1 byte. An aggregated statistic is 8 bytes. An ML inference result is a few hundred bytes. That is what gets cached.
Do Not Cache: Ciphertexts
Too large. Still requires decryption on every read. Roughly 2 × N × M × 8 bytes per fresh ciphertext (two ring polynomials, N coefficients each, 8 bytes per coefficient), where M is the number of RNS moduli. A 10,000-entry cache would consume 650MB to 20GB.
Cache This: Decrypted Results
Tiny. Ready to use. No decryption needed on read. A 10,000-entry cache fits in 5MB. The result is signed by three PQ families and bound to its computation fingerprint.
Five Stages. All Five Eliminated by Caching.
A typical FHE computation has five stages. On a cache hit, none of them execute.
The cached path is 0.000031% of the full pipeline: 31 nanoseconds against 100 milliseconds.
The Computation Fingerprint
The cache key is not a simple string. It is a computation fingerprint -- a cryptographic digest of everything that determines the FHE computation result. If any input changes, the fingerprint changes, and the cache misses. If everything is the same, the fingerprint matches, and the cached result is returned.
```
fingerprint = SHA3-256(
    input_ciphertext_hash   // hash of the encrypted input(s)
 || fhe_scheme              // BFV or CKKS
 || poly_degree             // N (ring dimension: 4096, 8192, 32768)
 || modulus_chain           // Q (ciphertext modulus chain)
 || plaintext_modulus       // t (65537 for BFV, N/A for CKKS)
 || computation_function    // hash of the function applied
 || engine_version          // Cachee engine semver
)
```
The fingerprint binds the cached result to the exact computation that produced it. Two different input ciphertexts produce different fingerprints. The same inputs with different FHE parameters (say, N=4096 vs N=8192) produce different fingerprints. The same everything with a different engine version produces a different fingerprint -- so a buggy engine's results are never served after an upgrade.
FHE Parameter Impact: Why Cache the Output
Ciphertext size grows with security level and ring dimension. The decrypted result stays small. This is why you cache the output, not the ciphertext.
| Scheme | Ring Dim (N) | Security | Ciphertext Size | Typical Computation | Decrypted Result | Cache Impact |
|---|---|---|---|---|---|---|
| BFV-128 | 4,096 | 128-bit | 65 KB (1 modulus) | Biometric match | 1 byte (bool) | 65,000x smaller |
| BFV-256 | 32,768 | 256-bit | 2+ MB (multi-moduli) | Private database query | ~200 bytes (row) | 10,000x smaller |
| CKKS | 8,192 | 128-bit | 130 KB (1 modulus) | ML inference | ~512 bytes (logits) | 260x smaller |
| CKKS (deep) | 32,768 | 128-bit | 4+ MB (multi-level) | Neural network (10+ layers) | ~1 KB (output vector) | 4,000x smaller |
The ratio is always lopsided. FHE ciphertexts are measured in kilobytes to megabytes. Decrypted results are measured in bytes. Caching the output instead of the ciphertext is not an optimization -- it is the only viable architecture.
Before and After
After the first FHE computation, every subsequent identical query is a 31-nanosecond in-process cache lookup. No encryption. No homomorphic evaluation. No decryption. The result is a signed, fingerprinted fact.
How Do You Trust a Cached FHE Result?
The computation fingerprint proves the cached result came from a specific FHE computation with specific parameters. The signatures prove it was not tampered with.
- The fingerprint is deterministic. Anyone can recompute it from the inputs and parameters. If the fingerprint matches, the cache entry was produced by exactly this computation.
- The result is signed. Three independent post-quantum signature families attest the cached result. Forging a cached result requires breaking all three simultaneously.
- The result is independently verifiable. The cachee-verify tool validates the H33-74 receipt against the signatures and fingerprint with no network call, no Cachee account, and no trust in any third party.
You don't trust the cache. You verify the receipt once, then trust the math.
Where FHE Caching Applies
FHE Cache Flow
Run it yourself: brew install cachee && cachee-fhe-demo
The Cost of Not Caching FHE
FHE is computationally expensive by design. The security comes from the hardness of lattice problems, and evaluating circuits over encrypted data requires polynomial arithmetic in large rings. This cost is unavoidable on the first computation. It is entirely avoidable on every subsequent identical computation.
| Workload | FHE Scheme | Queries/Day | CPU Cost Without Cache | CPU Cost With Cache |
|---|---|---|---|---|
| Biometric Auth | BFV-128 | 1M | 27.8 CPU-hours | 0.15 CPU-hours |
| Private ML | CKKS | 100K | 13.9 CPU-hours | 0.008 CPU-hours |
| Encrypted DB | BFV-256 | 500K | 27.8 CPU-hours | 0.04 CPU-hours |
| Analytics | CKKS | 50K | 6.9 CPU-hours | 0.004 CPU-hours |
The with-cache column assumes a steady-state hit rate above 99% (typical once the working set is warm). Even at a conservative 95% hit rate, the miss-driven FHE work drops 20x. The cache itself costs negligible memory -- decrypted results are bytes, not kilobytes.
BFV vs CKKS: What Changes
BFV (Exact Arithmetic)
BFV operates on integers modulo a plaintext modulus t. The decrypted result is exact -- the same inputs always produce the same output. FHE caching for BFV has 100% correctness: the cached result is bit-for-bit identical to a fresh computation. Ideal for biometric matching, encrypted search, and integer predicates.
CKKS (Approximate Arithmetic)
CKKS operates on approximate real numbers. Decrypted results have a precision bound (typically 40+ bits of accuracy). The fingerprint includes the precision level, and the cached result is valid within that precision. For ML inference and analytics, this precision is far beyond what the application needs.
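One consequence for caching: two fresh CKKS evaluations of the same circuit can differ below the precision bound, so the decrypted vector should be normalized to the declared precision before it becomes the canonical cached payload. A hedged sketch, assuming 40 fractional bits of accuracy:

```python
# Normalize a decrypted CKKS vector to its declared precision so the
# cached bytes are well-defined: sub-precision noise is rounded away.
def quantize(values: list[float], frac_bits: int = 40) -> list[int]:
    scale = 1 << frac_bits
    return [round(v * scale) for v in values]

exact = [0.71875, 2.5]                       # one evaluation's output
noisy = [0.71875 + 1e-15, 2.5 - 1e-15]       # re-evaluation, sub-2^-40 noise
assert quantize(exact) == quantize(noisy)    # identical cache payloads
```

BFV needs no such step: its decrypted integers are already bit-for-bit canonical.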
What the Fingerprint Prevents
Each field in the computation fingerprint exists to prevent a specific class of cache confusion attack. Remove any field, and the cache becomes unsound.
Without input_hash
Different encrypted inputs could share a cache entry. Patient A's diagnosis could be returned for Patient B's query.
Without computation_hash
Different functions could be confused. A "sum" result could be returned for an "average" query on the same data.
Without parameter_hash
Results computed under different security parameters could be mixed. A BFV-128 result (128-bit security, smaller ring) could be served for a query that demanded BFV-256 parameters.
Without engine_version
A result from a buggy prior version could be served after an upgrade. Version pinning ensures old results expire with the old engine.
Without fhe_scheme
A BFV exact integer result could be returned for a CKKS approximate query, or vice versa. Different schemes produce semantically different results.
Without signatures
A cache poisoning attack could insert a forged result. With three PQ signature families, forgery requires breaking all three simultaneously.
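The confusion attacks above can be exercised directly: with any collision-resistant serialization of the fingerprint fields (the one below is a stand-in, not Cachee's format), perturbing any single field yields an unrelated digest, so none of these confusions can alias into the same cache slot.

```python
import hashlib

# Stand-in fingerprint over named fields, length-prefixed per field.
def fp(fields: dict) -> str:
    h = hashlib.sha3_256()
    for key in sorted(fields):
        blob = f"{key}={fields[key]}".encode()
        h.update(len(blob).to_bytes(4, "big") + blob)
    return h.hexdigest()

base = {
    "input_hash": "deadbeef", "scheme": "BFV", "n": 4096,
    "moduli": "q0", "t": 65537, "func": "inner_product_v1",
    "engine": "1.4.2",
}
baseline = fp(base)

# Flip each field in turn; every one must change the digest.
for key, bad in [("input_hash", "cafebabe"), ("scheme", "CKKS"),
                 ("n", 8192), ("func", "sum_v1"), ("engine", "1.4.3")]:
    assert fp({**base, key: bad}) != baseline  # every field is load-bearing
```

The signatures cover what the fingerprint cannot: they stop an attacker from writing a validly-fingerprinted but fabricated result into the cache.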
Get Started
```shell
brew tap h33ai-postquantum/tap && brew install cachee
cachee init && cachee start

# Cache an FHE computation result
SET fhe:biometric_match_user42 "verified:true" FP <fingerprint_hex>

# Retrieve at 31ns -- no FHE recomputation
GETVERIFIED fhe:biometric_match_user42

# Verify the cached result independently
cachee-verify fhe:biometric_match_user42
```
140+ Redis-compatible commands. Drop-in for existing infrastructure. The FHE pipeline does not change -- you add a cache check before encryption and a cache write after decryption.
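That integration pattern is a thin wrapper around the existing pipeline. A sketch, where `cache`, `encrypt`, `evaluate`, and `decrypt` are placeholders for your cache client and your current FHE code:

```python
# Check the cache before encrypting; write the decrypted result after.
# The FHE pipeline itself is untouched -- it only runs on a miss.
def cached_fhe_query(cache, fingerprint: str, plaintext_input,
                     encrypt, evaluate, decrypt):
    hit = cache.get(fingerprint)
    if hit is not None:
        return hit                     # fast path: no FHE work at all
    ct = encrypt(plaintext_input)      # slow path, once per fingerprint
    result = decrypt(evaluate(ct))
    cache.set(fingerprint, result)     # future identical queries hit above
    return result
```

The first call per fingerprint pays the full five-stage cost; every identical call after it returns from the cache without touching key material or ciphertexts.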
Encrypt once. Evaluate once. Cache the result. Serve it forever at 31ns.
Explore Verifiable Computation Infrastructure
Every page in the Cachee knowledge base. Proven computation, not cached data.
- Computation Caching -- The category definition. Run computation once, serve forever.
- ZK Proof Caching -- Cache STARK and SNARK verification. 294x speedup.
- Computation Fingerprinting -- Identity for results. Provenance, not just output.
- Cache Attestation -- Signed cache entries. Three PQ families per SET.
- PQ Key Exchange Caching -- ML-KEM at 31ns. Session tickets for post-quantum TLS.
- Proof Reuse -- Verify once, serve forever. Architecture for verified results.
- Cache Bottleneck -- Why your cache is slower than your compute.
- Redis vs In-Process L1 -- 31ns vs 1ms. The network hop you do not need.
- PQ Key Size Reference -- Every post-quantum key, ciphertext, and signature size.