BFV · CKKS · Post-Quantum Signed · 10,000x Slowdown Eliminated

FHE Caching

Fully homomorphic encryption is 10,000x slower than plaintext.
Cache the decrypted result. Serve it at 31 nanoseconds.
The layer that eliminates redundant homomorphic computation.

10,000x
FHE Slowdown vs Plaintext
31ns
Cached Result
100%
Accuracy (Same Computation)
3
PQ Families Signed
Definition

FHE caching stores the results of fully homomorphic encryption computations. FHE operations are 1,000 to 10,000x slower than their plaintext equivalents. A single encrypted inner product on BFV-128 takes roughly 1 millisecond. The decrypted result of that computation, once cached, can be retrieved in 31 nanoseconds. A cached result eliminates the recomputation entirely -- no key generation, no encryption, no homomorphic evaluation, no decryption. The same answer, from cache, in nanoseconds.

The Gap

100 milliseconds vs 31 nanoseconds

The same computation result. One path re-encrypts, re-evaluates, and re-decrypts. The other remembers.

FHE Computation (encrypt + evaluate + decrypt) 100,000,000 ns
Re-compute every time
Cached Result (hash lookup + pointer dereference) 31 ns
3,225,806x
No encryption. No homomorphic evaluation. No decryption. Just the answer, from cache.
Critical Distinction

What Gets Cached: The Output, Not the Ciphertext

This is the most important architectural decision in FHE caching. You do not cache the ciphertext. You cache the decrypted result.

Ciphertexts are enormous. A single BFV-128 ciphertext with N=4096 and one 56-bit modulus is 65,536 bytes. BFV-256 with N=32768 and multiple moduli can exceed 2 megabytes per ciphertext. Caching ciphertexts would consume gigabytes of memory for even modest workloads -- and you would still need to decrypt them on every access.

The decrypted result -- the actual answer -- is typically a few bytes to a few kilobytes. A biometric match verdict is 1 byte. An aggregated statistic is 8 bytes. An ML inference result is a few hundred bytes. That is what gets cached.

Do Not Cache: Ciphertexts

65KB+
Per ciphertext (BFV-128, N=4096)

Too large. Still requires decryption on every read. 2 × N × M × 8 bytes per ciphertext (two ring elements of N 8-byte coefficients, where M is the number of RNS moduli). A 10,000-entry cache would consume 650MB to 20GB.

Cache This: Decrypted Results

1-512B
Per cached result

Tiny. Ready to use. No decryption needed on read. A 10,000-entry cache fits in 5MB. The result is signed by three PQ families and bound to its computation fingerprint.
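The sizing arithmetic behind the two cards above can be sketched directly. This is a back-of-envelope calculation using the figures quoted on this page (a BFV/CKKS ciphertext is two ring elements of N coefficients, 8 bytes per RNS modulus); the helper names are illustrative, not part of any Cachee API.

```python
def ciphertext_bytes(n: int, moduli: int) -> int:
    """Two ring elements of n coefficients, 8 bytes per RNS modulus."""
    return 2 * n * moduli * 8

def cache_footprint_mb(entries: int, entry_bytes: int) -> float:
    """Total cache size in megabytes for a given entry size."""
    return entries * entry_bytes / 1e6

bfv128 = ciphertext_bytes(4096, 1)              # 65,536 bytes per ciphertext
print(cache_footprint_mb(10_000, bfv128))       # ~655 MB of cached ciphertexts
print(cache_footprint_mb(10_000, 512))          # ~5 MB of cached decrypted results
```

Same 10,000 entries, two orders of magnitude apart: that gap is why the output, not the ciphertext, goes in the cache.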

FHE Pipeline Deep Dive

Five Stages. All Five Eliminated by Caching.

A typical FHE computation has five stages. On a cache hit, none of them execute.

Key generation: 15 ms (15%)
Encryption (NTT + sampling): 20 ms (20%)
Homomorphic evaluation: 45 ms (45%)
Key switching / relinearization: 10 ms (10%)
Decryption (INTT + rounding): 10 ms (10%)
Total FHE Computation: 100 ms
CACHED: 31 ns

The green sliver is 0.000031% of the red bars above. That is the cached path.

Identity

The Computation Fingerprint

The cache key is not a simple string. It is a computation fingerprint -- a cryptographic digest of everything that determines the FHE computation result. If any input changes, the fingerprint changes, and the cache misses. If everything is the same, the fingerprint matches, and the cached result is returned.

fingerprint = SHA3-256(
    input_ciphertext_hash    // hash of the encrypted input(s)
 || fhe_scheme               // BFV or CKKS
 || poly_degree              // N (ring dimension: 4096, 8192, 32768)
 || modulus_chain            // Q (ciphertext modulus chain)
 || plaintext_modulus        // t (65537 for BFV, N/A for CKKS)
 || computation_function     // hash of the function applied
 || engine_version           // Cachee engine semver
)

The fingerprint binds the cached result to the exact computation that produced it. Two different input ciphertexts produce different fingerprints. The same inputs with different FHE parameters (say, N=4096 vs N=8192) produce different fingerprints. The same everything with a different engine version produces a different fingerprint -- so a buggy engine's results are never served after an upgrade.
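A minimal sketch of how such a fingerprint could be assembled with a standard SHA3-256 implementation. The field names follow the listing above; the delimiter scheme and function signature are assumptions for illustration, not the Cachee wire format.

```python
import hashlib

def fhe_fingerprint(input_ciphertext_hash: bytes, fhe_scheme: str,
                    poly_degree: int, modulus_chain: str,
                    plaintext_modulus: str, computation_function: str,
                    engine_version: str) -> str:
    """Digest every parameter that determines the FHE result.
    Any change to any field changes the cache key."""
    h = hashlib.sha3_256()
    h.update(input_ciphertext_hash)
    for field in (fhe_scheme, str(poly_degree), modulus_chain,
                  plaintext_modulus, computation_function, engine_version):
        h.update(b"||" + field.encode())  # delimiter avoids field-boundary collisions
    return h.hexdigest()

input_hash = hashlib.sha3_256(b"encrypted biometric template").digest()
fp_a = fhe_fingerprint(input_hash, "BFV", 4096, "Q=56bit", "t=65537",
                       "inner_product", "v1.2.3")
fp_b = fhe_fingerprint(input_hash, "BFV", 8192, "Q=56bit", "t=65537",
                       "inner_product", "v1.2.3")  # only N differs
assert fp_a != fp_b  # different parameters, different cache key
```

Because the digest is deterministic, anyone holding the inputs and parameters can recompute it and check it against a cache entry.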

Why Size Matters

FHE Parameter Impact: Why Cache the Output

Ciphertext size grows with security level and ring dimension. The decrypted result stays small. This is why you cache the output, not the ciphertext.

Scheme | Ring Dim (N) | Security | Ciphertext Size | Typical Computation | Decrypted Result | Cache Impact
BFV-128 | 4,096 | 128-bit | 65 KB (1 modulus) | Biometric match | 1 byte (bool) | 65,000x smaller
BFV-256 | 32,768 | 256-bit | 2+ MB (multi-moduli) | Private database query | ~200 bytes (row) | 10,000x smaller
CKKS | 8,192 | 128-bit | 130 KB (1 modulus) | ML inference | ~512 bytes (logits) | 260x smaller
CKKS (deep) | 32,768 | 128-bit | 4+ MB (multi-level) | Neural network (10+ layers) | ~1 KB (output vector) | 4,000x smaller

The ratio is always lopsided. FHE ciphertexts are measured in kilobytes to megabytes. Decrypted results are measured in bytes. Caching the output instead of the ciphertext is not an optimization -- it is the only viable architecture.

Architecture

Before and After

Without FHE Caching
Plaintext input arrives
Generate keys (if needed)
Encrypt input (NTT + CBD sampling)
Homomorphic evaluation (45ms+)
Decrypt result (INTT + rounding)
Return plaintext result
↻ Repeat on every request. Same input. Same result. Same 100ms cost.
With FHE Caching
Input arrives
Compute fingerprint (SHA3-256)
Check cache (31ns)
Hit? Return instantly. Miss? Full FHE pipeline, then cache result.
Encrypt once. Evaluate once. Cache forever.

After the first FHE computation, every subsequent identical query is a 31-nanosecond in-process cache lookup. No encryption. No homomorphic evaluation. No decryption. The result is a signed, fingerprinted fact.
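The before/after flow above reduces to a small wrapper: fingerprint, check, and only run the pipeline on a miss. This is a minimal sketch of the integration pattern; the FHE pipeline is stubbed out, and the dict stands in for the in-process L1 cache.

```python
import hashlib

cache: dict[str, bytes] = {}  # stand-in for the in-process L1 cache

def run_fhe_pipeline(plaintext_input: bytes) -> bytes:
    """Stub for encrypt -> homomorphic evaluate -> decrypt (~100 ms)."""
    return b"verified:true"  # placeholder decrypted result

def cached_fhe(plaintext_input: bytes, params: str, fn: str, version: str) -> bytes:
    # 1. Compute the fingerprint from inputs + parameters (SHA3-256).
    fp = hashlib.sha3_256(plaintext_input + params.encode()
                          + fn.encode() + version.encode()).hexdigest()
    # 2. Check the cache: a hit skips all five FHE stages.
    if fp in cache:
        return cache[fp]
    # 3. Miss: run the full pipeline once, then cache the decrypted result.
    result = run_fhe_pipeline(plaintext_input)
    cache[fp] = result
    return result

r1 = cached_fhe(b"template-128d", "BFV,N=4096", "inner_product", "v1.2.3")  # miss
r2 = cached_fhe(b"template-128d", "BFV,N=4096", "inner_product", "v1.2.3")  # hit
```

The pipeline itself is untouched; the cache check wraps around it, which is why this drops into existing FHE deployments without changing the cryptography.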

Trust Model

How Do You Trust a Cached FHE Result?

The computation fingerprint proves the cached result came from a specific FHE computation with specific parameters. The signatures prove it was not tampered with.

🔒
ML-DSA-65
Lattice-based (MLWE). NIST standard. Successor to Dilithium. 3,309-byte signatures.
🔐
FALCON-512
NTRU lattice-based. Compact signatures (666 bytes). Distinct mathematical basis from ML-DSA.
🔑
SLH-DSA
Stateless hash-based. No lattice assumptions. 17,088-byte signatures. Minimal attack surface.
Three independent hardness assumptions
  • The fingerprint is deterministic. Anyone can recompute it from the inputs and parameters. If the fingerprint matches, the cache entry was produced by exactly this computation.
  • The result is signed. Three independent post-quantum signature families attest the cached result. Forging a cached result requires breaking all three simultaneously.
  • The result is independently verifiable. The cachee-verify tool validates the H33-74 receipt against the signatures and fingerprint with no network call, no Cachee account, and no trust in any third party.

You don't trust the cache. You verify the receipt once, then trust the math.

Applications

Where FHE Caching Applies

🧠
Private ML Inference
Run a neural network on encrypted patient data. The inference takes 500ms. The same patient's re-query hits cache at 31ns. No model or data exposed.
🗃
Encrypted Database Queries
Query an encrypted database without decrypting it. The homomorphic query takes 200ms. Repeated queries for the same predicate are served from cache.
📊
Privacy-Preserving Analytics
Compute aggregations over encrypted data -- sum, average, count -- without seeing individual records. Cache the aggregated result for dashboard refreshes.
👤
Biometric Matching
Compare an encrypted biometric template against encrypted enrolled templates. The FHE inner product takes 1ms per comparison. Cache the match result permanently.
🛡
Confidential Computing
Compute on encrypted data from multiple parties without a trusted third party. Cache the joint computation result so re-runs never re-encrypt.
💰
Encrypted Financial Analysis
Risk scoring, credit checks, fraud detection on encrypted financial records. Same client, same model, same result -- served from cache on repeat queries.
Live Demo

FHE Cache Flow

cachee-fhe-demo
[1] Input arrives: biometric template, 128 dimensions
[2] Compute fingerprint: SHA3-256(input_hash || BFV || N=4096 || Q=56bit || t=65537 || fn=inner_product || v1.2.3)
[3] Cache lookup: MISS
[4] FHE pipeline: encrypt 20ms + evaluate 45ms + decrypt 10ms = 75ms total
[5] Cache result: signed ML-DSA-65 + FALCON-512 + SLH-DSA
[6] H33-74 receipt: 58 bytes stored
 
[7] Same input arrives again
[8] Cache lookup: HIT -- 31ns
    2,419,354x faster. Zero FHE recomputation.

Run it yourself: brew install cachee && cachee-fhe-demo

The Economics

The Cost of Not Caching FHE

FHE is computationally expensive by design. The security comes from the hardness of lattice problems, and evaluating circuits over encrypted data requires polynomial arithmetic in large rings. This cost is unavoidable on the first computation. It is entirely avoidable on every subsequent identical computation.

Workload | FHE Scheme | Queries/Day | CPU Cost Without Cache | CPU Cost With Cache
Biometric Auth | BFV-128 | 1M | 27.8 CPU-hours | 0.15 CPU-hours
Private ML | CKKS | 100K | 13.9 CPU-hours | 0.008 CPU-hours
Encrypted DB | BFV-256 | 500K | 27.8 CPU-hours | 0.04 CPU-hours
Analytics | CKKS | 50K | 6.9 CPU-hours | 0.004 CPU-hours

Assumes a 95% cache hit rate (conservative for stable workloads). At a 99% hit rate, the residual compute cost drops another 5x. The cache itself consumes negligible memory -- decrypted results are bytes, not kilobytes.
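The without-cache column follows directly from queries per day times per-query FHE latency; the with-cache column depends on the assumed hit rate, since only misses pay the FHE cost. A quick sanity check (latencies are the per-query figures used elsewhere on this page):

```python
def cpu_hours(queries_per_day: int, fhe_ms: float) -> float:
    """Daily CPU time spent on FHE at a given per-query latency."""
    return queries_per_day * fhe_ms / 1000 / 3600

def cached_cpu_hours(queries_per_day: int, fhe_ms: float, hit_rate: float) -> float:
    """With a cache, only the miss fraction pays the FHE cost."""
    return cpu_hours(queries_per_day, fhe_ms) * (1 - hit_rate)

print(round(cpu_hours(1_000_000, 100), 1))  # biometric auth at 100 ms: 27.8
print(round(cpu_hours(100_000, 500), 1))    # private ML at 500 ms:     13.9
```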

Scheme Comparison

BFV vs CKKS: What Changes

BFV (Exact Arithmetic)

Exact
Integer computations, perfect caching

BFV operates on integers modulo a plaintext modulus t. The decrypted result is exact -- the same inputs always produce the same output. FHE caching for BFV has 100% correctness: the cached result is bit-for-bit identical to a fresh computation. Ideal for biometric matching, encrypted search, and integer predicates.

CKKS (Approximate Arithmetic)

~2⁻⁴⁰
Floating-point, precision-bounded caching

CKKS operates on approximate real numbers. Decrypted results have a precision bound (typically 40+ bits of accuracy). The fingerprint includes the precision level, and the cached result is valid within that precision. For ML inference and analytics, this precision is far beyond what the application needs.
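Precision-bounded caching means a cached CKKS value is accepted if it agrees with a fresh decryption within the scheme's error bound. A minimal sketch, assuming a 40-bit precision level (the tolerance constant and comparison rule here are illustrative):

```python
import math

PRECISION_BITS = 40                 # assumed CKKS accuracy from the fingerprint
TOL = 2.0 ** -PRECISION_BITS        # absolute error bound, ~9.1e-13

def ckks_results_match(cached: float, fresh: float) -> bool:
    """A cached CKKS result is valid if it agrees with a fresh
    decryption within the precision bound recorded at cache time."""
    return math.isclose(cached, fresh, rel_tol=0.0, abs_tol=TOL)

assert ckks_results_match(0.7312894, 0.7312894 + 2e-13)  # inside the bound
assert not ckks_results_match(0.7312894, 0.7312994)      # real drift: recompute
```

For BFV no such tolerance is needed: the decrypted integers are bit-for-bit identical, so a straight equality check suffices.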

Security

What the Fingerprint Prevents

Each field in the computation fingerprint exists to prevent a specific class of cache confusion attack. Remove any field, and the cache becomes unsound.

Without input_hash

Different encrypted inputs could share a cache entry. Patient A's diagnosis could be returned for Patient B's query.

Without computation_hash

Different functions could be confused. A "sum" result could be returned for an "average" query on the same data.

Without parameter_hash

Results from different security levels could be mixed. A BFV-128 result (less precise) could be served for a BFV-256 query.

Without engine_version

A result from a buggy prior version could be served after an upgrade. Version pinning ensures old results expire with the old engine.

Without fhe_scheme

A BFV exact integer result could be returned for a CKKS approximate query, or vice versa. Different schemes produce semantically different results.

Without signatures

A cache poisoning attack could insert a forged result. With three PQ signature families, forgery requires breaking all three simultaneously.

Install

Get Started

brew tap h33ai-postquantum/tap && brew install cachee
cachee init && cachee start

# Cache an FHE computation result
SET fhe:biometric_match_user42 "verified:true" FP <fingerprint_hex>

# Retrieve at 31ns -- no FHE recomputation
GETVERIFIED fhe:biometric_match_user42

# Verify the cached result independently
cachee-verify fhe:biometric_match_user42

140+ Redis-compatible commands. Drop-in for existing infrastructure. The FHE pipeline does not change -- you add a cache check before encryption and a cache write after decryption.

Encrypt once. Evaluate once. Cache the result. Serve it forever at 31ns.

Install Cachee Computation Caching

Deep Dives

Knowledge Base

Explore Verifiable Computation Infrastructure

Every page in the Cachee knowledge base. Proven computation, not cached data.

Post-Quantum Caching
The category definition. Run computation once, serve forever.
ZK Proof Caching
Cache STARK and SNARK verification. 294x speedup.
Computation Fingerprinting
Identity for results. Provenance, not just output.
Cache Attestation
Signed cache entries. Three PQ families per SET.
PQ Key Exchange Caching
ML-KEM at 31ns. Session tickets for post-quantum TLS.
Proof Reuse
Verify once, serve forever. Architecture for verified results.
Cache Bottleneck
Why your cache is slower than your compute.
Redis vs In-Process L1
31ns vs 1ms. The network hop you do not need.
PQ Key Size Reference
Every post-quantum key, ciphertext, and signature size.