NIST Level 1 Key Caching: FALCON-512 + ML-KEM-512

May 1, 2026 | 15 min read | Engineering

Post-quantum cryptography is often discussed in terms of its worst-case sizes. ML-DSA-87 signatures at 4,627 bytes. ML-KEM-1024 ciphertexts at 1,568 bytes. These numbers are real, and they are the right choice for certain applications. But they are not the only choice. NIST Level 1 exists for a reason: it provides 128-bit classical security equivalent with the smallest possible post-quantum footprint, and for a large category of systems, it is exactly sufficient.

The lightest viable post-quantum bundle combines FALCON-512 for signatures and ML-KEM-512 for key encapsulation. FALCON-512 produces 690-byte signatures with 897-byte public keys. ML-KEM-512 uses 800-byte public keys and 768-byte ciphertexts. Counting one signature plus one KEM public key, a single session requires 1,490 bytes of cryptographic material. Compare that to classical ECDSA-P256 plus X25519: a 64-byte signature and a 32-byte Diffie-Hellman share. That is 96 bytes total on the same basis, with signing public keys left out of both counts. The post-quantum Level 1 bundle is 15.5x larger than its classical equivalent. That is the smallest PQ multiplier available. And it matters enormously for cache architecture.

PQ Level 1 per session	Classical per session	Size multiplier
1,490 B	96 B	15.5x

What NIST Level 1 Actually Means

NIST defines five security levels for post-quantum cryptographic algorithms. Level 1 is the baseline. It requires that breaking the algorithm is at least as hard as performing a key search on AES-128. In concrete terms, this means approximately 2^128 operations for a classical attacker and approximately 2^64 operations for a quantum attacker using Grover's algorithm. For context, 2^128 classical operations is beyond the reach of any computing infrastructure that will exist in the foreseeable future. It is the same security level that protects most of the internet today via AES-128 in TLS 1.3.

The "Level 1" designation does not mean "weak." It means "equivalent to AES-128." Every HTTPS connection you have ever made to a website using TLS 1.3 with AES-128-GCM operates at this security level for symmetric encryption. The post-quantum algorithms at Level 1 are designed to provide the same security margin against both classical and quantum attackers. The difference between Level 1 and Level 5 is not whether the algorithm can be broken, but how large of a quantum computer would be required, and how many logical qubits and gate operations the attack demands.

Level 1 security is defined relative to AES-128 key search. Level 3 is defined relative to AES-192 key search. Level 5 is defined relative to AES-256 key search. The practical distinction matters for threat modeling: if your data has a secrecy lifetime of 10 years and you believe a cryptographically relevant quantum computer will exist within 10 years, you might choose Level 5 for defense in depth. If your data has a secrecy lifetime of minutes (session tokens, ephemeral API keys, development environments), Level 1 provides more than sufficient protection.
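The mapping from security level to attack cost can be written down directly. The sketch below is a simplified model: it assumes exhaustive key search costs about 2^n operations for an n-bit AES key and that Grover's algorithm needs about 2^(n/2) iterations, and it ignores circuit-depth limits and error-correction overhead, which make real quantum attacks more expensive. The function name is illustrative.

```python
# NIST Levels 1, 3, and 5 are defined relative to AES key search.
AES_KEY_BITS = {1: 128, 3: 192, 5: 256}

def key_search_cost_exponents(level: int) -> tuple:
    """Return (classical, Grover) costs as base-2 exponents."""
    bits = AES_KEY_BITS[level]
    # Exhaustive search tries ~2^bits keys; Grover needs ~2^(bits/2) iterations.
    return bits, bits // 2

for level in AES_KEY_BITS:
    classical, quantum = key_search_cost_exponents(level)
    print(f"Level {level}: ~2^{classical} classical, ~2^{quantum} Grover iterations")
```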

The FALCON-512 Component

FALCON is a post-quantum digital signature scheme based on NTRU lattices. It was selected by NIST for standardization alongside ML-DSA (Dilithium) and SLH-DSA (SPHINCS+). FALCON's distinguishing characteristic is compact signatures. At Level 1, FALCON-512 produces signatures that are 690 bytes, compared to 2,420 bytes for ML-DSA-44 (the smallest ML-DSA parameter set, which targets Level 2) and 17,088 bytes for SLH-DSA-128f. For signature-heavy workloads, FALCON-512 is the clear winner on size.

The key sizes for FALCON-512 are 897 bytes for the public key and 1,281 bytes for the private key. The public key is also smaller than ML-DSA-44's 1,312-byte public key, so FALCON-512 is ahead on both counts, and the signature size advantage is the one that matters most in any system that transmits or caches more signatures than public keys. In a typical session-based authentication flow, the server caches the public key once per user and verifies multiple signatures per session. The per-verification cache overhead is dominated by signature size, not public key size.

FALCON's mathematical foundation is the NTRU lattice problem, specifically the short integer solution (SIS) problem over NTRU-structured lattices. This is a different hardness assumption from ML-DSA, which relies on the Module Learning With Errors (MLWE) and Module SIS problems. Using FALCON means your system's security rests on a different mathematical foundation than if you used ML-DSA. This distinction becomes critical when you combine multiple signature schemes for defense in depth: breaking the combined attestation requires breaking both NTRU lattices and module lattices. Both are lattice problems, so the two bets are distinct rather than fully independent.

FALCON-512 Performance Characteristics

FALCON signing is more complex than ML-DSA signing because it requires sampling from a discrete Gaussian distribution using a trapdoor sampler. This operation must be performed in constant time to avoid side-channel attacks, which makes the implementation more delicate than ML-DSA's simpler rejection sampling approach. Key generation for FALCON-512 takes approximately 8-12 milliseconds on modern hardware, compared to ML-DSA-44's sub-millisecond keygen. Signing takes approximately 0.5-2 milliseconds, and verification takes approximately 0.1-0.3 milliseconds.

The verification speed is what matters for caching. When you cache FALCON-512 verification results, you eliminate the 0.1-0.3 millisecond verification cost on cache hits. At 100,000 verifications per second with a 90% cache hit rate, you save 9-27 seconds of cumulative CPU time per second. The cached lookup takes approximately 35 nanoseconds for an in-process hash map. The verification result is a single bit: valid or invalid. The cache entry overhead is the SHA3-256 fingerprint (32 bytes) plus the result byte plus metadata, totaling approximately 41 bytes per cached verification.
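A verification-result cache of this shape can be sketched in a few lines. This is a minimal illustration, not Cachee's implementation: the choice to fingerprint the public key, message, and signature together is an assumption, `VerificationCache` is a hypothetical name, and `fake_verify` is a stand-in that is not a FALCON verifier.

```python
import hashlib

def fingerprint(public_key: bytes, message: bytes, signature: bytes) -> bytes:
    """SHA3-256 over length-prefixed fields so field boundaries are unambiguous."""
    h = hashlib.sha3_256()
    for field in (public_key, message, signature):
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.digest()  # 32 bytes

class VerificationCache:
    def __init__(self, verify):
        self._verify = verify   # callable (pk, msg, sig) -> bool
        self._results = {}      # fingerprint -> cached valid/invalid result
        self.hits = 0
        self.misses = 0

    def check(self, public_key: bytes, message: bytes, signature: bytes) -> bool:
        key = fingerprint(public_key, message, signature)
        if key in self._results:
            self.hits += 1
            return self._results[key]
        self.misses += 1
        result = self._verify(public_key, message, signature)
        self._results[key] = result
        return result

# Stand-in verifier for demonstration only; NOT a FALCON implementation.
def fake_verify(pk: bytes, msg: bytes, sig: bytes) -> bool:
    return sig == hashlib.sha3_256(pk + msg).digest()

cache = VerificationCache(fake_verify)
pk, msg = b"\x01" * 897, b"request-body"
sig = hashlib.sha3_256(pk + msg).digest()
first = cache.check(pk, msg, sig)    # miss: runs the verifier
second = cache.check(pk, msg, sig)   # hit: returns the stored result

# CPU time avoided per second at the quoted load: hits per second * verify cost.
hits_per_second = 100_000 * 0.90
saved_low = hits_per_second * 0.1e-3
saved_high = hits_per_second * 0.3e-3
print(f"saved {saved_low:.0f}-{saved_high:.0f} CPU-seconds per second")
```

A production version would also bound the map's size and expire entries; both are omitted here to keep the lookup path visible.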

The ML-KEM-512 Component

ML-KEM (formerly known as CRYSTALS-Kyber) is the NIST-selected post-quantum key encapsulation mechanism. It is the only KEM standardized by NIST, making it the default choice for post-quantum key exchange and key establishment. ML-KEM-512 is the Level 1 variant with an 800-byte public key, a 768-byte ciphertext, and a 32-byte shared secret.

The 800-byte public key is the primary cache concern for ML-KEM-512. In a TLS-like handshake, the server sends its public key to the client, the client encapsulates a shared secret using the public key, and the server decapsulates using its private key. The public key must be available at the start of every connection. In a system handling 100,000 new connections per second, that is 100,000 public key retrievals per second. If the public key is fetched from a remote store on every connection, the 800-byte key plus network overhead adds latency to every handshake. Caching the public key in-process eliminates this latency entirely.

The ciphertext size matters for session storage. Each ML-KEM-512 encapsulation produces a 768-byte ciphertext that the server must receive and process. If you are logging or storing session establishment records, the 768-byte ciphertext per session adds up. At 1 million sessions per day, that is 768 MB of ciphertext data per day just for key exchange records. This is manageable but notable compared to the 32-byte Diffie-Hellman shares in classical X25519.

Combined Per-Session Overhead: The Memory Math

A single authenticated session using the FALCON-512 + ML-KEM-512 bundle requires caching the following material: the FALCON-512 public key (897 bytes) for signature verification throughout the session, the most recent FALCON-512 signature (690 bytes) for the current session token or attestation, the ML-KEM-512 encapsulation ciphertext (768 bytes) for the session key establishment, and session metadata (approximately 135 bytes for timestamps, identifiers, and state). The total per-session cache footprint is approximately 2,490 bytes: 2,355 bytes of cryptographic material plus metadata. The 1,490-byte headline figure used in the comparisons below is narrower. It counts one 690-byte signature plus the 800-byte ML-KEM-512 public key.
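The entry layout described above can be modelled as a small record type to check the arithmetic. This is a sketch under assumptions: the `SessionEntry` name and its field names are hypothetical, and the footprint counts payload bytes only, not allocator or runtime overhead.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SessionEntry:
    falcon_public_key: bytes   # 897 bytes
    falcon_signature: bytes    # 690 bytes, most recent token or attestation
    kem_ciphertext: bytes      # 768 bytes
    metadata: bytes            # ~135 bytes of timestamps, identifiers, state

    def footprint(self) -> int:
        """Payload bytes only; excludes runtime and allocator overhead."""
        return (len(self.falcon_public_key) + len(self.falcon_signature)
                + len(self.kem_ciphertext) + len(self.metadata))

# Zero-filled placeholders with the sizes quoted in the text.
entry = SessionEntry(bytes(897), bytes(690), bytes(768), bytes(135))
print(entry.footprint())
```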

At scale, these numbers tell a clear story about what cache architecture is viable.

Active Sessions	PQ Level 1 (1,490 B)	Classical (96 B)	PQ Multiplier
10,000	14.9 MB	0.96 MB	15.5x
100,000	149 MB	9.6 MB	15.5x
1,000,000	1.49 GB	96 MB	15.5x
10,000,000	14.9 GB	960 MB	15.5x

One million sessions at Level 1 require 1.49 GB of cache for cryptographic material alone. This is significant but well within the capacity of modern server memory. A typical production server has 32-256 GB of RAM. Dedicating 1.49 GB to session cache is reasonable. At 10 million sessions, 14.9 GB is still feasible on high-memory instances but starts to compete with application memory for the working set.
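The scaling figures are straightforward multiplication, and a short script makes it easy to substitute your own session counts. The sketch uses decimal units (1 MB = 10^6 bytes), as the table does, and covers cryptographic material only. The function name is illustrative.

```python
PQ_LEVEL1_BYTES = 1_490   # one signature + one KEM public key, per the article
CLASSICAL_BYTES = 96

def cache_megabytes(sessions: int, bytes_per_session: int) -> float:
    """Cryptographic material only, in decimal megabytes."""
    return sessions * bytes_per_session / 1e6

for sessions in (10_000, 100_000, 1_000_000, 10_000_000):
    pq = cache_megabytes(sessions, PQ_LEVEL1_BYTES)
    classical = cache_megabytes(sessions, CLASSICAL_BYTES)
    print(f"{sessions:>10,}  PQ {pq:>9,.1f} MB  classical {classical:>7,.2f} MB")
```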

Compare these numbers to Level 3 and Level 5 to understand why Level 1 is attractive for high-session-count deployments. ML-KEM-768 plus ML-DSA-65 at Level 3 requires approximately 4,493 bytes per session, roughly 3x larger than Level 1. At 1 million sessions, that is 4.49 GB. ML-KEM-1024 plus ML-DSA-87 at Level 5 requires approximately 6,195 bytes per session, roughly 4.2x larger than Level 1. At 1 million sessions, that is 6.2 GB. The choice of security level directly determines your cache memory requirements, and Level 1 keeps the memory footprint closest to that of classical systems.

Level 1 Memory Advantage

At 1 million concurrent sessions, the Level 1 bundle (FALCON-512 + ML-KEM-512) requires 1.49 GB of cache memory. Level 3 requires 4.49 GB (3x more). Level 5 requires 6.2 GB (4.2x more). For systems with high session counts and moderate security requirements, Level 1 is the only PQ bundle where cache memory does not become a primary infrastructure concern.

Cache Architecture for Level 1

The 15.5x size increase from classical to Level 1 post-quantum is small enough that most existing cache architectures remain viable without fundamental redesign. This is the core advantage of Level 1: it is the only PQ security level where you can often use the same cache infrastructure you use today, just with more memory provisioned.

In-Process L1 Cache

An in-process hash map (DashMap, Rust's standard HashMap behind a RwLock, or equivalent) handles Level 1 material comfortably. At 1 million sessions with 2,490 bytes per entry (including metadata), the total memory footprint is approximately 2.49 GB. This fits in the L3 cache of modern server CPUs for the hot subset and in main memory for the full set. Lookup latency is 30-50 nanoseconds for an in-process hash map, regardless of value size, because the hash computation dominates and the value is returned as a pointer, not a copy.

The in-process approach eliminates all network overhead. There is no serialization, no TCP round-trip, no connection pool management. The session material is in the same address space as the application. For FALCON-512 verification, the public key is read directly from the cache into the verification function without any intermediate copies. This is the fastest possible path from cache to cryptographic operation.

Network-Attached Cache

At Level 1 sizes, even network-attached caches like Redis or Memcached remain viable, though not optimal. A single GET for a 2,490-byte value over a local network takes approximately 0.2-0.4 milliseconds: about 0.1 milliseconds for the network round-trip plus 0.1-0.3 milliseconds for serialization and protocol overhead. Against a 35-nanosecond in-process lookup, that is on the order of 10,000x slower, but still fast enough for most applications that are not latency-critical below the millisecond level.
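To see what the gap means for a request that performs several lookups, the comparison can be worked through numerically. The 0.35 ms network figure and the three lookups per request are illustrative assumptions, not measurements.

```python
IN_PROCESS_SECONDS = 35e-9     # 35 ns in-process hash-map lookup, as quoted
NETWORK_GET_SECONDS = 0.35e-3  # illustrative network GET latency (assumption)
LOOKUPS_PER_REQUEST = 3        # hypothetical: public key, signature, session state

ratio = NETWORK_GET_SECONDS / IN_PROCESS_SECONDS
network_cost = LOOKUPS_PER_REQUEST * NETWORK_GET_SECONDS
in_process_cost = LOOKUPS_PER_REQUEST * IN_PROCESS_SECONDS

print(f"network GET is ~{ratio:,.0f}x slower than an in-process lookup")
print(f"per request: {network_cost * 1e3:.2f} ms vs {in_process_cost * 1e9:.0f} ns")
```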

The viability of network-attached caches at Level 1 means that teams migrating from classical to post-quantum cryptography can often keep their existing cache infrastructure for the initial migration. They do not need to re-architect their cache layer on day one. They need to provision roughly 15.5x more memory for the cryptographic portion of each cache entry, adjust eviction policies for the larger values, and monitor for increased network bandwidth between application servers and cache nodes. These are operational changes, not architectural changes.

However, for systems that already operate at the boundary of cache performance with classical key sizes, Level 1 will push them over the edge. If your Redis cluster is already at 80% memory utilization with classical session data, multiplying the cryptographic material by 15.5x will not fit without adding nodes. Planning for this capacity increase before the migration is essential.

Eviction Policy Considerations

The eviction policy for Level 1 session caches should prioritize frequency over recency. CacheeLFU tracks how often each entry is accessed and evicts the least frequently used entries first. This is superior to LRU for session data because session access patterns are bimodal: active sessions are accessed many times per minute (signature verification on every request), while expired or idle sessions are accessed rarely. LRU would keep recently created but immediately abandoned sessions in cache while evicting frequently accessed active sessions during capacity pressure. CacheeLFU correctly identifies active sessions as high-value and retains them.
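The difference from LRU is easiest to see in a toy example. The class below is a generic, minimal least-frequently-used cache written for illustration. It is not CacheeLFU, whose internals are not described here, and its eviction scan is O(n) for clarity.

```python
class LFUCache:
    """Minimal LFU cache: evicts the entry with the fewest recorded reads."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._values = {}   # key -> cached bytes
        self._counts = {}   # key -> number of reads

    def get(self, key):
        if key not in self._values:
            return None
        self._counts[key] += 1
        return self._values[key]

    def set(self, key, value):
        if key not in self._values:
            if len(self._values) >= self.capacity:
                # Evict the least frequently read entry.
                victim = min(self._counts, key=self._counts.get)
                del self._values[victim]
                del self._counts[victim]
            self._counts[key] = 0
        self._values[key] = value

cache = LFUCache(capacity=2)
cache.set("session:active", b"pk-a")
for _ in range(5):
    cache.get("session:active")            # active session: read on every request
cache.set("session:abandoned", b"pk-b")    # created, then never read again
cache.set("session:new", b"pk-c")          # capacity pressure forces an eviction
```

Here the abandoned session is evicted and the active one survives. An LRU policy would have evicted `session:active`, because `session:abandoned` was touched more recently. Plain LFU also tends to evict brand-new entries first, which is why production policies usually add some form of aging.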

The eviction cost at Level 1 is lower than at higher security levels because re-fetching an evicted Level 1 entry is cheaper. A FALCON-512 public key is 897 bytes, which can be fetched from a database or key server in under a millisecond. At Level 5, re-fetching an ML-DSA-87 public key (2,592 bytes) plus re-verifying the associated certificate chain is more expensive. This means Level 1 caches can tolerate higher eviction rates without significant performance degradation, which in turn means you can size the cache more aggressively (smaller) without unacceptable miss penalties.

Where Level 1 Is Sufficient

Not every system needs Level 5 security. The security level should match the data's sensitivity and secrecy lifetime. Level 1 provides 128-bit classical security, which is the same security margin as AES-128, and is appropriate for several common deployment scenarios.

Internal API Authentication

Services communicating within a private network (VPC, data center, Kubernetes cluster) face a different threat model than public-facing APIs. The attacker must first compromise the network to observe or tamper with internal traffic. Post-quantum protection at Level 1 ensures that even if the attacker records encrypted internal traffic today and later gains access to a quantum computer, they cannot decrypt the recorded sessions. The Level 1 margin, equivalent to AES-128 key search, is sufficient because the data's secrecy lifetime is typically minutes to hours, not decades.

Session Tokens for Non-Critical Systems

Session tokens for internal dashboards, development tools, content management systems, and similar applications do not protect data with multi-decade secrecy requirements. The token authenticates the user for the duration of the session, typically 15 minutes to 24 hours. An attacker who breaks the session token's cryptographic protection after the session expires gains nothing. Level 1 provides ample security for these short-lived credentials.

IoT Device Attestation

IoT devices have severe constraints on computation, memory, and bandwidth. FALCON-512 signatures at 690 bytes are small enough to fit in constrained network packets. ML-KEM-512 key exchange at 800-byte public keys is feasible even on devices with limited memory. Level 1 is often the only viable PQ security level for resource-constrained IoT devices, and the 128-bit security margin is appropriate for device attestation flows where the attestation is verified within seconds of generation.

Development and Staging Environments

Development and staging environments should use post-quantum cryptography to ensure that developers are testing against the same cryptographic stack that will run in production. But these environments do not protect production data. Level 1 provides the PQ algorithms and the cache behavior patterns without the memory overhead of Level 3 or Level 5. Developers can test cache eviction, key rotation, and session management with realistic PQ key sizes at a fraction of the infrastructure cost.

Comparison: Level 1 vs Level 3 vs Level 5

The choice between security levels is ultimately a choice about cache architecture. Each level increase multiplies the per-session footprint, and that multiplication propagates through every layer of the cache hierarchy.

Component	Level 1	Level 3	Level 5
KEM public key	800 B (ML-KEM-512)	1,184 B (ML-KEM-768)	1,568 B (ML-KEM-1024)
KEM ciphertext	768 B (ML-KEM-512)	1,088 B (ML-KEM-768)	1,568 B (ML-KEM-1024)
Sig public key	897 B (FALCON-512)	1,952 B (ML-DSA-65)	2,592 B (ML-DSA-87)
Signature	690 B (FALCON-512)	3,309 B (ML-DSA-65)	4,627 B (ML-DSA-87)
Per-session total (KEM public key + signature)	1,490 B	4,493 B	6,195 B
vs classical (96 B)	15.5x	46.8x	64.5x
1M sessions	1.49 GB	4.49 GB	6.2 GB
10M sessions	14.9 GB	44.9 GB	62 GB

The table makes the architectural implication clear. At Level 1 with 10 million sessions, you need 14.9 GB: a single high-memory server or a small cache cluster. At Level 3, 44.9 GB typically calls for a dedicated cache cluster with multiple nodes. At Level 5, 62 GB exceeds what smaller servers can hold alongside application memory and in practice calls for distributed caching infrastructure. The security level you choose determines not just your cryptographic security margin but your cache infrastructure costs.
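The totals and multipliers in the table can be recomputed from the component sizes. This sketch assumes the per-session total counts the KEM public key plus one signature, which is the combination that reproduces the table's figures. The dictionary layout and function name are illustrative.

```python
# Component sizes in bytes, from the comparison table.
BUNDLES = {
    1: {"kem_public_key": 800, "signature": 690},      # ML-KEM-512 + FALCON-512
    3: {"kem_public_key": 1_184, "signature": 3_309},  # ML-KEM-768 + ML-DSA-65
    5: {"kem_public_key": 1_568, "signature": 4_627},  # ML-KEM-1024 + ML-DSA-87
}
CLASSICAL_BYTES = 96

def per_session_bytes(level: int) -> int:
    """KEM public key plus one signature for the given level."""
    return sum(BUNDLES[level].values())

for level in BUNDLES:
    total = per_session_bytes(level)
    multiplier = round(total / CLASSICAL_BYTES, 1)
    gb_at_10m = total * 10_000_000 / 1e9
    print(f"Level {level}: {total:,} B, {multiplier}x classical, {gb_at_10m:.1f} GB at 10M sessions")
```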

The Recommendation: Tiered Security Levels

The optimal approach is not to pick one security level for everything. It is to tier your security levels based on data sensitivity and cache architecture constraints. This tiered approach lets you minimize cache memory where Level 1 is sufficient while providing maximum protection where Level 5 is required.

Level 1 for internal services. Service-to-service authentication within your private network, internal API tokens, development environments, staging environments, and any system where the data's secrecy lifetime is measured in hours, not years. The 1,490-byte per-session footprint keeps cache memory manageable even at millions of concurrent sessions.

Level 3 for external APIs. Customer-facing APIs, partner integrations, OAuth token issuance, and any system where the data crosses a trust boundary. Level 3 provides a higher security margin at 3x the cache cost of Level 1. At millions of concurrent sessions, this requires dedicated cache infrastructure but is operationally feasible with modern hardware.

Level 5 for long-lived credentials. Root certificates, code signing keys, document signatures that must be verifiable for decades, financial transaction records, and any credential subject to CNSA 2.0 requirements. The 6,195-byte per-session footprint is expensive to cache but these use cases typically have lower session counts and longer lifetimes, making the cache size manageable.
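The effect of tiering on total cache memory can be estimated with a simple lookup. The workload categories mirror the three tiers above. The session counts in `fleet` are hypothetical and chosen only to illustrate the calculation, and all names are illustrative.

```python
from enum import Enum

class Workload(Enum):
    INTERNAL_SERVICE = "internal service"
    EXTERNAL_API = "external API"
    LONG_LIVED_CREDENTIAL = "long-lived credential"

# Tier assignment from the recommendation above.
NIST_LEVEL = {
    Workload.INTERNAL_SERVICE: 1,
    Workload.EXTERNAL_API: 3,
    Workload.LONG_LIVED_CREDENTIAL: 5,
}
PER_SESSION_BYTES = {1: 1_490, 3: 4_493, 5: 6_195}

def tiered_cache_bytes(sessions: dict) -> int:
    """Total cryptographic cache bytes when each workload uses its own tier."""
    return sum(PER_SESSION_BYTES[NIST_LEVEL[w]] * n for w, n in sessions.items())

# Hypothetical fleet; the session counts are illustrative only.
fleet = {
    Workload.INTERNAL_SERVICE: 1_000_000,
    Workload.EXTERNAL_API: 200_000,
    Workload.LONG_LIVED_CREDENTIAL: 1_000,
}
tiered = tiered_cache_bytes(fleet)
all_level5 = PER_SESSION_BYTES[5] * sum(fleet.values())
print(f"tiered: {tiered / 1e9:.2f} GB, everything at Level 5: {all_level5 / 1e9:.2f} GB")
```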

Implementation: Cachee with FALCON-512 + ML-KEM-512

The Cachee CLI supports NIST Level 1 configuration out of the box. When initialized with Level 1 parameters, Cachee automatically adjusts its memory allocation, eviction thresholds, and cache line sizes for the 1,490-byte per-entry footprint. The cache is backed by an in-process DashMap with CacheeLFU eviction, providing 30-50 nanosecond lookups regardless of the cached value size.

# Initialize Cachee for NIST Level 1 sessions
cachee init --pq-level 1 --capacity 1000000

# Cache a FALCON-512 public key
cachee set session:user:12345:pk <897-byte-public-key> --ttl 3600

# Retrieve for verification (35ns in-process)
cachee get session:user:12345:pk

# Monitor Level 1 cache metrics
cachee status --level-1

# Output:
# NIST Level 1 session cache:
#   Entries:    847,293 / 1,000,000
#   Memory:     1.26 GB / 1.49 GB
#   Hit rate:   94.7%
#   Avg hit:    35ns
#   Eviction:   CacheeLFU
#   PQ bundle:  FALCON-512 + ML-KEM-512

The cache supports TTL-based expiration aligned with session lifetimes. FALCON-512 public keys are cached for the duration of the session or the key's validity period, whichever is shorter. ML-KEM-512 encapsulation records are cached for audit and replay protection. Verification results (valid/invalid) are cached separately with their own TTL, enabling the 294x verification speedup described in our STARK verification optimization guide.
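The "whichever is shorter" rule for public-key TTLs reduces to a minimum over two remaining lifetimes. The helper below is a hypothetical illustration of that rule, not part of the Cachee CLI or API.

```python
def cache_ttl_seconds(session_remaining: int, key_validity_remaining: int) -> int:
    """TTL for a cached public key: the shorter remaining lifetime, never negative."""
    return max(0, min(session_remaining, key_validity_remaining))

# Session has an hour left, but the key expires in 15 minutes.
print(cache_ttl_seconds(3600, 900))
# Key already expired: a TTL of zero means do not cache.
print(cache_ttl_seconds(3600, -30))
```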

Migration Path from Classical

Migrating from classical to Level 1 post-quantum cryptography in the cache layer is a three-step process. First, inventory your current cache utilization: how many session entries, how many bytes per entry, what is your memory headroom. Second, multiply the cryptographic material portion of each entry by 15.5x and verify that the new total fits within your available memory or within a reasonable memory upgrade budget. Third, update the cache value schemas to accommodate the larger key and signature sizes, deploy, and monitor eviction rates and hit ratios to ensure the CacheeLFU policy is performing correctly with the new value sizes.

The 15.5x multiplier at Level 1 is the most manageable PQ migration. If your current cache uses 100 MB for classical session cryptographic material, you need 1.55 GB for Level 1. If you have 4 GB of cache memory available, you can migrate without any infrastructure changes. This ease of migration is the strongest practical argument for Level 1 in environments where cache infrastructure is already deployed and operational.
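The capacity check in the second migration step can be scripted. This sketch assumes only the cryptographic portion of each entry grows by 15.5x while other cached data stays the same size. The function names and the second scenario's figures are hypothetical.

```python
PQ_MULTIPLIER = 15.5
MB, GB = 10**6, 10**9

def projected_cache_bytes(crypto_bytes: float, other_bytes: float = 0.0) -> float:
    """Only the cryptographic portion grows; other cached data is unchanged."""
    return crypto_bytes * PQ_MULTIPLIER + other_bytes

def fits(crypto_bytes: float, available_bytes: float, other_bytes: float = 0.0) -> bool:
    return projected_cache_bytes(crypto_bytes, other_bytes) <= available_bytes

# The example from the text: 100 MB of classical material, 4 GB available.
projected = projected_cache_bytes(100 * MB)
print(f"{projected / GB:.2f} GB needed, fits in 4 GB: {fits(100 * MB, 4 * GB)}")

# Hypothetical tight case: a 4 GB cluster at 80% utilization (3.2 GB used),
# of which 0.2 GB is cryptographic material and 3.0 GB is other session data.
tight = fits(0.2 * GB, 4 * GB, other_bytes=3.0 * GB)
print(f"tight cluster fits: {tight}")
```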

The Bottom Line

NIST Level 1 with FALCON-512 + ML-KEM-512 is the lightest post-quantum bundle at 1,490 bytes per session, a 15.5x increase over classical. It provides 128-bit security equivalent, sufficient for internal APIs, session tokens, IoT attestation, and development environments. At this size, in-process caching handles millions of sessions comfortably, and even network-attached caches remain viable. Start with Level 1 for internal services, escalate to Level 3 for external APIs, and reserve Level 5 for long-lived credentials. The security level you choose is a cache architecture decision as much as a cryptographic one.

Post-quantum session caching at 35 nanoseconds. Start with Level 1.

brew install cachee