Session Mobility

Your sessions are trapped on one server.
We set them free.

Stateful workloads — crypto keypairs, ML contexts, user sessions — pin you to a single instance. Cachee's tiered L1/L2 architecture makes any session available on any instance at sub-microsecond latency. Scale horizontally without losing state.

0.085 µs
L1 Session Lookup
< 1 ms
L2 Cold Session Load
Horizontal Instances
The Problem

Stateful sessions kill horizontal scaling

When session state lives in local memory, each instance becomes a single point of failure for the sessions it holds. One instance goes down and every session on it is gone.

Without Cachee

Customer hits Instance A, which generates a crypto keypair and stores it in a local HashMap. When the load balancer routes the next request to Instance B — session not found, 404. You're stuck with sticky sessions, which means you can't auto-scale, can't do rolling deploys, and one instance failure cascades to every session it held.

With Cachee

Session state writes to Cachee (L2 Redis) on creation and caches locally (L1 DashMap) for sub-microsecond access. Any instance can load any session on first access — then it's cached locally for every subsequent call. Auto-scale freely. Deploy without draining. Lose an instance and nothing is lost.
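The mobility pattern above can be sketched in a few lines. This is a toy model, not the Cachee client: plain dicts stand in for the shared L2 (Redis) and for each instance's local L1.

```python
class Instance:
    """Toy app instance: a private L1 dict backed by a shared L2 store."""
    def __init__(self, l2):
        self.l1 = {}        # local cache, sub-microsecond in the real system
        self.l2 = l2        # shared store, a network hop in the real system

    def put_session(self, sid, state):
        self.l2[sid] = state          # write-through to shared L2 on creation
        self.l1[sid] = state          # and cache locally

    def get_session(self, sid):
        if sid in self.l1:            # L1 hit: no network round-trip
            return self.l1[sid]
        state = self.l2[sid]          # L1 miss: fetch from shared L2...
        self.l1[sid] = state          # ...and promote into this instance's L1
        return state

shared_l2 = {}
a, b = Instance(shared_l2), Instance(shared_l2)
a.put_session("s1", {"keypair": "..."})
print(b.get_session("s1"))            # B never created s1, but serves it anyway
```

After the first read on B, the session sits in B's L1, so every later request there skips the shared store entirely.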

Architecture

Two-tier session mobility, zero code changes

Cachee speaks the Redis protocol (RESP). Point your session store at Cachee instead of raw Redis — you get L1 caching, connection pooling, and circuit breaking for free.
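Because Cachee speaks RESP, the swap is typically a connection-string change. A hedged before/after sketch (host names and ports are placeholders, not real endpoints):

```shell
# Before: session store points at raw Redis
export SESSION_STORE_URL="redis://redis.internal:6379"

# After: same protocol, same client library -- only the host changes
export SESSION_STORE_URL="redis://cachee.internal:6379"

# With TLS terminated at the proxy, use the rediss:// scheme instead
export SESSION_STORE_URL="rediss://cachee.internal:6380"
```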

┌──────────────────────────────────────┐
│      Cachee / ElastiCache (L2)       │
│  KEY:   "session:{id}"               │
│  VALUE: serialized state (350KB-2MB) │
│  TTL:   configurable (1h default)    │
└────────────┬─────────────────────────┘
             │
  ┌──────────┼──────────┐
  │          │          │
Instance A   Instance B   Instance C
┌──────────┐ ┌──────────┐ ┌──────────┐
│ L1 Cache │ │ L1 Cache │ │ L1 Cache │
│ 0.085 µs │ │ 0.085 µs │ │ 0.085 µs │
│ DashMap  │ │ DashMap  │ │ DashMap  │
└──────────┘ └──────────┘ └──────────┘

Session created on A → Stored in L2 → Accessed on B → Cached in B's L1

Every subsequent request on B hits local L1 at 0.085 µs. Zero network hops.
How It Works

Session lifecycle in four steps

From creation to cross-instance access — the entire flow is transparent to your application code.

1

Session Created

Your app generates session state (keypairs, model context, user data) and writes it to Cachee with a TTL. Cachee stores it in L2 (Redis) and caches it in the local L1.

SET + L1 cache: ~0.5 ms
2

Same Instance — L1 Hit

Subsequent requests routed to the same instance find the session in the L1 DashMap. No network round-trip. No deserialization overhead.

L1 lookup: 0.085 µs
3

Different Instance — L2 Hit

Load balancer routes to a different instance. L1 miss triggers an L2 (Redis) fetch. The session is deserialized and promoted to the new instance's L1 for all future requests.

L2 fetch + promote: < 1 ms
4

Session Expires — Clean TTL

When TTL expires, the session is automatically evicted from both L1 and L2. No zombie state. No manual cleanup. SETEX semantics you already know.

TTL-based lifecycle
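The four steps above can be modeled in a short sketch. Assumptions are loud here: dicts stand in for DashMap and Redis, and an injectable clock stands in for real time; the toy consults L2's expiry on every read, whereas the real L1 tracks TTLs locally.

```python
import time

class TieredStore:
    """Toy two-tier store modeling the L1/L2 session lifecycle with TTL."""
    def __init__(self, now=time.monotonic):
        self.now = now
        self.l1 = {}      # sid -> state            (local DashMap stand-in)
        self.l2 = {}      # sid -> (state, expiry)  (Redis SETEX stand-in)

    def setex(self, sid, ttl, state):
        # Step 1: session created -- write to L2 with a TTL, cache in L1
        self.l2[sid] = (state, self.now() + ttl)
        self.l1[sid] = state

    def get(self, sid):
        entry = self.l2.get(sid)
        if entry is None or entry[1] <= self.now():
            # Step 4: TTL expired -- evict from both tiers, no zombie state
            self.l2.pop(sid, None)
            self.l1.pop(sid, None)
            return None
        if sid in self.l1:            # Step 2: same instance, L1 hit
            return self.l1[sid]
        state = entry[0]              # Step 3: new instance, L2 hit...
        self.l1[sid] = state          # ...promoted to L1 for future requests
        return state
```

The `setex` name mirrors the SETEX semantics the section closes on: one write establishes both the value and its lifetime.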
Comparison

Cachee vs. raw Redis vs. sticky sessions

Three approaches to session state at scale — only one gives you sub-microsecond hot reads with zero code changes.

                     | Sticky Sessions             | Raw Redis                          | Cachee
Hot session read     | ~0.1 µs (local)             | 0.5-1 ms (network)                 | 0.085 µs (L1)
Cold session read    | 404 (wrong instance)        | 0.5-1 ms                           | < 1 ms (L2 + auto-promote)
Horizontal scaling   | Blocked (affinity required) | Possible (every read hits network) | Free (any instance, any session)
Instance failure     | All sessions lost           | Sessions survive                   | Sessions survive + L1 auto-rebuilds
Rolling deploys      | Session drain required      | Seamless                           | Seamless + warm L1 on new instances
Connection pooling   | N/A                         | You manage it                      | Built-in (32 connections, configurable)
Circuit breaker      | N/A                         | You build it                       | Built-in (auto-fallback to L1)
Code changes         | Load balancer config        | Session store adapter              | Change host:port (same Redis protocol)
Use Cases

Every industry has a session scaling problem

If your workload generates state that's expensive to recreate and needs to be accessible from any instance, Cachee solves it.

FHE & Privacy Tech

Crypto keypairs (BFV/CKKS contexts, secret keys, relinearization keys) generated per-session. 350KB-2MB per session. Enable encrypted compute across a horizontal fleet.

350KB-2MB sessions

ML Inference

Model context windows, embedding caches, conversation history. Keep inference stateless while maintaining rich session context across any GPU instance in your fleet.

Stateless inference fleet

Gaming

Player sessions, matchmaking state, inventory snapshots. Scale game servers elastically without losing player progress when instances spin up or down.

Elastic server scaling

Trading & Fintech

Order book snapshots, trading session state, risk calculation contexts. Fail over between instances without recomputing expensive position state.

Active-active failover

Healthcare

PHI processing sessions, DICOM viewer state, clinical workflow contexts. HIPAA-compliant session sharing with encryption at rest and TTL-based lifecycle management.

HIPAA-compliant sessions

IoT & Edge

Device twin state, telemetry aggregation windows, command queues. Distribute device session state across edge nodes without centralized coordination.

Edge-distributed state
Security

Session state is sensitive. We treat it that way.

Three layers of protection for your session data, from transport to storage to access control.

Transport Encryption

TLS 1.3 on every connection. Cachee proxy terminates TLS via rediss:// — your session data never travels in plaintext.

Application-Layer Encryption

Wrap sensitive payloads (crypto keys, PHI, tokens) with AES-256-GCM before storing. Cachee sees only opaque blobs. Even a Redis compromise reveals nothing.
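A minimal sketch of the wrap step, assuming Python and the third-party cryptography package; the function names and the idea of binding the session id as associated data are illustrative, not Cachee APIs.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def wrap(key: bytes, session_id: str, payload: bytes) -> bytes:
    """Encrypt a session payload before it is written to the store.

    The session id is bound as associated data, so a blob copied
    under a different key in Redis fails authentication on decrypt.
    """
    nonce = os.urandom(12)                     # 96-bit nonce, unique per write
    ct = AESGCM(key).encrypt(nonce, payload, session_id.encode())
    return nonce + ct                          # store nonce alongside ciphertext

def unwrap(key: bytes, session_id: str, blob: bytes) -> bytes:
    nonce, ct = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ct, session_id.encode())

key = AESGCM.generate_key(bit_length=256)      # in production: from a KMS/HSM
blob = wrap(key, "session:abc", b"secret keypair material")
assert unwrap(key, "session:abc", blob) == b"secret keypair material"
```

The store only ever sees `nonce + ciphertext`: an opaque blob, exactly as the text describes.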

Encryption at Rest

ElastiCache backend supports AWS KMS encryption at rest. Combined with your application-layer wrapping, session data is encrypted at every layer of the stack.
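For the ElastiCache backend, at-rest encryption is configured when the replication group is created. A hedged sketch using the AWS CLI (the group id, node type, and key ARN are placeholders):

```shell
aws elasticache create-replication-group \
  --replication-group-id cachee-l2 \
  --replication-group-description "Cachee L2 session store" \
  --engine redis \
  --cache-node-type cache.r6g.large \
  --at-rest-encryption-enabled \
  --transit-encryption-enabled \
  --kms-key-id arn:aws:kms:us-east-1:123456789012:key/EXAMPLE
```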

Stop pinning sessions. Start scaling.

Cachee speaks Redis protocol — point your session store at us and get L1 caching, connection pooling, and circuit breaking with zero code changes. Your first 10,000 operations are free.