Kubernetes Caching Best Practices with Sidecar Patterns
Kubernetes has transformed how we deploy applications, but caching in a containerized, orchestrated environment requires new patterns. This guide shows you how to implement high-performance caching using sidecar containers, ensuring fast local access with minimal complexity and operational overhead.
Why Sidecar Caching for Kubernetes?
Traditional centralized caching (Redis cluster, Memcached) works in Kubernetes, but sidecars offer compelling advantages:
- Ultra-low latency: localhost access eliminates network hops (0.1ms vs. 2-5ms)
- Resource isolation: Cache memory doesn't compete with application memory
- Zero network overhead: Shared volumes or localhost communication
- Automatic scaling: Cache capacity scales with pod replicas
- Simplified networking: No need to manage separate cache clusters
The sidecar pattern places a cache container alongside your application container in the same pod, sharing the network namespace and optionally storage volumes.
Pattern 1: Basic Sidecar Cache
Start with a simple Redis sidecar for local caching:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
      # Main application container
      - name: app
        image: myapp:v1.0
        ports:
        - containerPort: 8080
        env:
        - name: CACHE_URL
          value: "redis://localhost:6379"
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
      # Sidecar cache container
      - name: cache
        image: redis:7-alpine
        ports:
        - containerPort: 6379
        command:
        - redis-server
        - --maxmemory
        - "256mb"
        - --maxmemory-policy
        - allkeys-lru
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "200m"
Pattern 2: Shared Volume Cache for Persistence
For caches that should survive container restarts, back Redis with an emptyDir volume. An emptyDir lives as long as the pod, so data written there survives a cache container crash or restart, though not pod rescheduling; with medium: Memory the volume is a RAM-backed tmpfs, trading durability for speed:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server-persistent
spec:
  template:
    spec:
      volumes:
      - name: cache-storage
        emptyDir:
          medium: Memory  # Use RAM for fast access
          sizeLimit: 512Mi
      containers:
      - name: app
        image: myapp:v1.0
        volumeMounts:
        - name: cache-storage
          mountPath: /cache
      - name: cache
        image: redis:7-alpine
        command:
        - redis-server
        - --dir
        - /data
        - --appendonly
        - "yes"
        - --appendfsync
        - "everysec"
        volumeMounts:
        - name: cache-storage
          mountPath: /data
        resources:
          requests:
            memory: "512Mi"
            cpu: "100m"
Pattern 3: Multi-Tier Sidecar Cache
Combine in-memory and persistent caching with two sidecar containers:
spec:
  template:
    spec:
      volumes:
      # Backs the L2 cache's append-only file
      - name: cache-storage
        emptyDir: {}
      containers:
      # Application
      - name: app
        image: myapp:v1.0
        env:
        - name: L1_CACHE_URL
          value: "redis://localhost:6379"
        - name: L2_CACHE_URL
          value: "redis://localhost:6380"
      # L1: Fast in-memory cache
      - name: l1-cache
        image: redis:7-alpine
        ports:
        - containerPort: 6379
        command:
        - redis-server
        - --port
        - "6379"
        - --maxmemory
        - "128mb"
        - --maxmemory-policy
        - volatile-lru
        resources:
          limits:
            memory: "256Mi"
      # L2: Larger persistent cache
      - name: l2-cache
        image: redis:7-alpine
        ports:
        - containerPort: 6380
        command:
        - redis-server
        - --port
        - "6380"
        - --maxmemory
        - "512mb"
        - --appendonly
        - "yes"
        volumeMounts:
        - name: cache-storage
          mountPath: /data
        resources:
          limits:
            memory: "768Mi"
Application code for multi-tier caching (shown here with the ioredis client):
const Redis = require('ioredis');

class TieredCache {
  constructor() {
    this.l1 = new Redis(6379); // Fast, small
    this.l2 = new Redis(6380); // Slower, larger
  }

  async get(key) {
    // Try L1 first
    let value = await this.l1.get(key);
    if (value) return value;

    // Try L2
    value = await this.l2.get(key);
    if (value) {
      // Promote to L1 with a short TTL
      await this.l1.set(key, value, 'EX', 300);
      return value;
    }
    return null;
  }

  async set(key, value, ttl = 3600) {
    // Write to both tiers; L1 keeps a shorter TTL
    await Promise.all([
      this.l1.set(key, value, 'EX', Math.min(ttl, 300)),
      this.l2.set(key, value, 'EX', ttl),
    ]);
  }
}
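Usage looks like any read-through cache; hot keys served from L2 are promoted back into L1 automatically (the session value below is just a placeholder):

const cache = new TieredCache();

async function getSession(sessionId) {
  let session = await cache.get(`session:${sessionId}`);
  if (!session) {
    // Placeholder for a real database lookup on a full miss
    session = JSON.stringify({ userId: 42 });
    await cache.set(`session:${sessionId}`, session, 3600);
  }
  return JSON.parse(session);
}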
Pattern 4: Service Mesh Integration
Integrate with Istio or Linkerd for advanced traffic management and observability:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  template:
    metadata:
      annotations:
        # Istio sidecar injection
        sidecar.istio.io/inject: "true"
        # Custom cache sidecar
        sidecar.istio.io/userVolume: '{"cache-storage":{"emptyDir":{}}}'
        sidecar.istio.io/userVolumeMount: '{"cache-storage":{"mountPath":"/cache"}}'
    spec:
      containers:
      - name: app
        image: myapp:v1.0
      - name: cache
        image: cachee-ai/sidecar:latest
        ports:
        - containerPort: 6379
          name: cache
        env:
        - name: CACHE_SIZE_MB
          value: "512"
        - name: ML_OPTIMIZATION
          value: "true"
Service Mesh Benefits
- Automatic mTLS: Encrypted cache communication
- Traffic policies: Rate limiting, circuit breaking
- Observability: Built-in metrics and tracing
- Zero-trust security: Fine-grained access control
Pattern 5: Init Container for Cache Warming
Pre-populate the cache before the main application starts. Ordinary init containers run to completion one at a time, so the cache itself has to run as a native sidecar (an init container with restartPolicy: Always, enabled by default since Kubernetes 1.29); that way it is already up when the warming init container runs and keeps running for the life of the pod:
spec:
  template:
    spec:
      initContainers:
      # Cache runs as a native sidecar: starts first, stays up with the pod
      - name: cache
        image: redis:7-alpine
        restartPolicy: Always
        ports:
        - containerPort: 6379
        startupProbe:
          exec:
            command: ["redis-cli", "ping"]
          periodSeconds: 2
      # Warm cache: runs after the cache is ready, before the app starts
      - name: cache-warmer
        image: myapp:v1.0
        command: ["node", "warm-cache.js"]
        env:
        - name: CACHE_URL
          value: "redis://localhost:6379"
      containers:
      - name: app
        image: myapp:v1.0
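What warm-cache.js does depends on your application; here is a sketch of one possible version, where the hot-key list and loadFromDb are hypothetical placeholders:

const Redis = require('ioredis');

const cache = new Redis(process.env.CACHE_URL);

// Hypothetical list of keys worth preloading before traffic arrives
const HOT_KEYS = ['config:feature-flags', 'catalog:top-products'];

async function loadFromDb(key) {
  // Placeholder for a real database or upstream API lookup
  return JSON.stringify({ key, warmedAt: Date.now() });
}

async function main() {
  for (const key of HOT_KEYS) {
    const value = await loadFromDb(key);
    await cache.set(key, value, 'EX', 3600);
  }
  await cache.quit(); // disconnect so the init container can exit cleanly
}

main().catch((err) => {
  console.error('cache warming failed:', err);
  process.exit(1);
});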
Resource Management Best Practices
Right-Sizing Cache Containers
# Calculate optimal cache size
apiVersion: v1
kind: ConfigMap
metadata:
  name: cache-sizing
data:
  sizing-guide: |
    For each pod:
    - L1 cache: 10-20% of app memory (hot data)
    - L2 cache: 50-100% of app memory (warm data)

    Example: 2GB app container
    - L1: 256MB (Redis sidecar)
    - L2: 1GB (Redis sidecar with persistence)

    Total pod memory: ~3.5GB
    - App: 2GB
    - L1 cache: 256MB
    - L2 cache: 1GB
    - Overhead: ~250MB
CPU Allocation
# Cache sidecar CPU allocation (typical workloads)
resources:
  requests:
    cpu: "100m"    # 0.1 core baseline
  limits:
    cpu: "500m"    # Burst to 0.5 core

# For high-throughput applications
resources:
  requests:
    cpu: "250m"    # 0.25 core baseline
  limits:
    cpu: "1000m"   # Burst to 1 core
Monitoring and Observability
Expose the cache metrics port through a Service and scrape it with a Prometheus ServiceMonitor:
apiVersion: v1
kind: Service
metadata:
  name: api-server-cache-metrics
  labels:
    app: api-server
spec:
  selector:
    app: api-server
  ports:
  - name: cache-metrics
    port: 9121
    targetPort: 9121
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cache-sidecar
spec:
  selector:
    matchLabels:
      app: api-server
  endpoints:
  - port: cache-metrics
    interval: 30s
Add the Redis exporter as an additional sidecar container:
- name: cache-exporter
  image: oliver006/redis_exporter:latest
  ports:
  - containerPort: 9121
  env:
  - name: REDIS_ADDR
    value: "localhost:6379"
  resources:
    requests:
      memory: "32Mi"
      cpu: "10m"
    limits:
      memory: "64Mi"
      cpu: "50m"
Zero-Downtime Cache Updates
Use rolling updates with readiness probes:
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    spec:
      containers:
      - name: cache
        image: redis:7-alpine
        readinessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 5
          periodSeconds: 5
        livenessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 30
          periodSeconds: 10
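On the application side, keep the cache out of the critical path of your own readiness check so a cache hiccup doesn't pull the pod from rotation. A hedged sketch, assuming an Express app and an ioredis client:

const express = require('express');
const Redis = require('ioredis');

const app = express();
const cache = new Redis(process.env.CACHE_URL, {
  connectTimeout: 1000,
  maxRetriesPerRequest: 1,
});

app.get('/readyz', async (req, res) => {
  let cacheStatus = 'ok';
  try {
    await cache.ping(); // fails fast thanks to the short timeouts above
  } catch (err) {
    cacheStatus = 'degraded'; // the app can still serve from the database
  }
  // Report Ready either way; the cache is an optimization, not a hard dependency
  res.status(200).json({ app: 'ok', cache: cacheStatus });
});

app.listen(8080);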
Common Pitfalls to Avoid
- Over-allocating memory: Set maxmemory lower than container limits
- No eviction policy: Always configure maxmemory-policy
- Missing resource limits: Cache can starve application of resources
- Ignoring persistence: Use volumes for caches that need to survive restarts
- No monitoring: Deploy exporters and track hit rates, memory usage
Conclusion
Sidecar caching in Kubernetes offers the best of both worlds: the simplicity of local caching with the scalability of cloud-native infrastructure. Start with a basic Redis sidecar, add persistence as needed, and consider multi-tier patterns for high-traffic applications.
For production deployments, combine sidecar caching with proper resource limits, monitoring, service mesh integration, and ML-powered optimization to achieve 94%+ hit rates with minimal operational overhead.
Deploy ML-Optimized Sidecar Caching
Cachee AI's Kubernetes operator automatically deploys and configures optimized cache sidecars with ML-powered TTL optimization and predictive prefetching.