
MEC Architecture: Deploying Cachee at the 5G Edge

February 8, 2026 • 7 min read • Architecture

Multi-access Edge Computing (MEC) moves computation from centralized clouds to the network edge—right inside the carrier infrastructure, meters from the cell tower. Cachee deploys natively in this environment as a containerized service within carrier MEC Kubernetes, sitting between the User Plane Function (UPF) and application servers.

Zero changes to the 5G core. Zero disruption to existing traffic flows. Under 30 seconds to deploy.

- Deployment time: <30s
- Cache hit rate: 94-98%
- L1 cache latency: 1.21ns
- 5G core changes: 0

The Four-Layer Architecture

Cachee slots into the existing 5G MEC stack without requiring modifications to any layer above or below it. Here is how each layer fits together.

Layer 1

User Plane

UE devices (phones, IoT sensors, AR headsets) connect through the 5G Radio Access Network (gNodeB) and are routed by the User Plane Function (UPF). This layer is entirely standard 3GPP. Cachee does not touch it.

Layer 2

Cachee MEC Layer

This is where Cachee lives. Four services run as containers within the carrier's MEC Kubernetes cluster:

- AI Prediction Engine (LSTM, Transformer, and RL models)
- L1 In-Process Cache (1.21ns latency)
- Edge Proxy (NVMe-backed)
- Compliance Engine (30+ regulations)

Layer 3

Carrier Core

The 5G core network functions remain untouched. Cachee operates entirely in the user plane—the control plane never knows it exists.

Layer 4

Origin / Cloud

The backend servers where content originates. Cachee reduces the load on these by 94-98% through predictive caching.
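The origin offload is straightforward to check; a minimal sketch, where the traffic figure is illustrative (the hit rates are the ones quoted in this post):

```python
def origin_load(total_rps: float, hit_rate: float) -> float:
    """Requests per second that still reach origin after edge caching."""
    return total_rps * (1.0 - hit_rate)

# With an assumed 100,000 req/s arriving at the edge:
for hit_rate in (0.94, 0.98):
    print(f"hit={hit_rate:.0%}: origin sees {origin_load(100_000, hit_rate):,.0f} req/s")
```

At the quoted hit rates, an origin that would otherwise absorb the full load sees only the residual miss traffic.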

Architecture Diagram

                        5G MEC Architecture with Cachee
    ============================================================

    LAYER 1: USER PLANE
    +----------+     +-----------+     +-------+
    |    UE    | --> |  gNodeB   | --> |  UPF  |
    | Devices  |     | (5G RAN)  |     |       |
    +----------+     +-----------+     +---+---+
                                           |
                          traffic steering  |
                                           v
    ----------------------------------------+-------------------
    LAYER 2: CACHEE MEC (Kubernetes)        |
    +---------------------------------------+----------------+
    |                                                        |
    |  +------------------+    +------------------------+    |
    |  | AI Prediction    |    | L1 In-Process Cache    |    |
    |  | Engine           |--->| (1.21ns latency)       |    |
    |  | LSTM+Trans.+RL   |    +----------+-------------+    |
    |  +------------------+               |                  |
    |                           HIT (94-98%) | MISS (2-6%)   |
    |                                v       v               |
    |  +------------------+    +------------------------+    |
    |  | Compliance       |    | Edge Proxy             |    |
    |  | Engine (30+ regs)|    | (NVMe backing)         |    |
    |  +------------------+    +----------+-------------+    |
    |                                     |                  |
    +-------------------------------------+------------------+
                                          |
    --------------------------------------+------------------
    LAYER 3: CARRIER CORE                 |
    +-------------+  +-----------+        |
    |  AMF / SMF  |  | Network   |        |
    | (unchanged) |  | Slicing   |        |
    +-------------+  +-----------+        |
                                          |
    --------------------------------------+------------------
    LAYER 4: ORIGIN / CLOUD               |
    +----------+  +----------+  +---------+------+
    | AWS/GCP/ |  | Content  |  | API            |
    | Azure    |  | Origins  |  | Backends       |
    +----------+  +----------+  +----------------+

Request Flow: Cache Hit (94-98% of Requests)

The vast majority of requests never leave the MEC boundary. Here is the timing breakdown for a cache hit:

| Step | Component | Cumulative Time |
|------|-----------|-----------------|
| 1. UE Request | Device sends request | t = 0 ms |
| 2. 5G RAN | gNodeB radio processing | t = 4 ms |
| 3. UPF Steering | Traffic routed to MEC | t = 7 ms |
| 4. Cachee AI Check | Prediction engine lookup | t = 7.5 ms |
| 5. L1 Cache HIT | In-process memory return | t = 7.5 ms |
| 6. Response to UE | Return through RAN | t ≈ 10.5 ms |
10.5ms end-to-end. From the moment a user taps their screen to the moment data arrives back at their device. The physics of 5G radio adds ~7ms; Cachee adds 0.5ms. The rest is return path. Compare this to the 50-100ms typical of CDN-served 5G content.
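The timing breakdown above can be sketched as a simple latency budget. Hop values are taken directly from the table (the 3 ms return path is the difference between the 7.5 ms hit point and the ≈10.5 ms total):

```python
# Per-hop latency contributions for the cache-hit path (values from the
# timing table in this post; illustrative, not measured).
HIT_PATH_MS = {
    "5G RAN (uplink radio)": 4.0,
    "UPF steering to MEC": 3.0,
    "Cachee AI check + L1 hit": 0.5,
    "Return path through RAN": 3.0,
}

def end_to_end_ms(hops: dict) -> float:
    """Sum the per-hop contributions into an end-to-end figure."""
    return sum(hops.values())

print(f"cache hit end-to-end: {end_to_end_ms(HIT_PATH_MS)} ms")
```

The budget makes the headline claim concrete: radio physics dominates, and the caching layer itself contributes only 0.5 ms.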

Request Flow: Cache Miss (2-6% of Requests)

When the AI engine does not have a prediction and the L1 cache does not contain the requested content, the request falls through to origin:

| Step | Component | Cumulative Time |
|------|-----------|-----------------|
| 1. UE Request | Device sends request | t = 0 ms |
| 2. 5G RAN | gNodeB radio processing | t = 4 ms |
| 3. UPF Steering | Traffic routed to MEC | t = 7 ms |
| 4. Cachee AI Check | Prediction engine — MISS | t = 7.5 ms |
| 5. Edge Proxy Forward | Request forwarded to origin | t = 8 ms |
| 6. Origin Fetch | Cloud backend responds | t = 20–35 ms |
| 7. Cache + Respond | Store in L1, return to UE | t = 25–40 ms |

Even on a miss, Cachee is still faster than a traditional architecture because the edge proxy maintains persistent connection pools to origin servers. And critically, the fetched content is now cached—every subsequent request for the same content hits the 10.5ms path.
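Blending the two paths gives the expected average latency a user sees. A quick sketch, where the 32.5 ms miss figure is an assumption (the midpoint of the 25–40 ms range above):

```python
def expected_latency_ms(hit_rate: float, hit_ms: float = 10.5,
                        miss_ms: float = 32.5) -> float:
    """Average end-to-end latency weighted by the cache hit rate."""
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms

for hr in (0.94, 0.98):
    print(f"hit rate {hr:.0%}: ~{expected_latency_ms(hr):.1f} ms average")
```

Even at the low end of the quoted hit-rate range, the blended average stays well under the 50–100 ms typical of CDN-served content.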

AI-Powered Predictive Pre-Fetching

The most powerful component of the Cachee MEC layer is the AI Prediction Engine. Rather than waiting for a cache miss to occur, Cachee predicts what content will be requested up to 30 minutes ahead and pre-fetches it into the L1 cache.

The engine combines three models: an LSTM network, a Transformer, and a reinforcement learning agent.

This is why cache hit rates reach 94-98%. Traditional CDNs use static TTL rules and reactive caching. Cachee proactively fills the cache with content it knows will be requested. The 2-6% miss rate consists almost entirely of genuinely novel, never-before-seen content.
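The pre-fetch behavior described above can be sketched as a warm-up loop. Here `predict` and `fetch` are hypothetical stand-ins for the prediction engine and the origin client, not Cachee APIs:

```python
import time
from typing import Callable, Iterable

def prefetch_loop(predict: Callable[[float], Iterable[str]],
                  fetch: Callable[[str], bytes],
                  cache: dict,
                  window_min: int = 30,
                  cycles: int = 1) -> None:
    """Ask the predictor for keys expected within the prediction window
    and warm the cache before any request arrives."""
    for _ in range(cycles):
        horizon = time.time() + window_min * 60  # predict up to 30 min ahead
        for key in predict(horizon):
            if key not in cache:      # only fetch what is not already warm
                cache[key] = fetch(key)
        # A real deployment would sleep between cycles and evict stale keys.
```

The point of the sketch is the inversion: instead of caching reactively on a miss, the loop fills the cache ahead of demand, which is what pushes hit rates into the 94-98% range.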

Network Slice Integration

5G network slicing allows carriers to create dedicated virtual networks for different traffic types. Cachee integrates with the slicing architecture to provide differentiated caching behavior:

| Slice Type | Cachee Behavior | Typical Use Case |
|------------|-----------------|------------------|
| eMBB (Enhanced Mobile Broadband) | Aggressive pre-fetch, large L1 allocation | 4K/8K video, cloud gaming |
| URLLC (Ultra-Reliable Low-Latency) | Minimal processing overhead, priority L1 path | V2X, industrial control |
| mMTC (Massive Machine-Type Comms) | High-cardinality key handling, compact values | IoT sensor networks |
| Custom Enterprise | Compliance-first routing, tenant isolation | Private 5G, campus networks |

Carriers can dedicate a low-latency slice specifically for Cachee-optimized traffic. Requests on this slice get priority UPF steering to the MEC layer, bypassing the carrier core entirely for cached content. The result is the lowest possible latency path: radio to cache to radio, with nothing in between.
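The slice table maps naturally to a per-slice policy lookup. A minimal sketch, where the field names and numeric shares are assumptions for illustration, not product configuration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SlicePolicy:
    prefetch: str        # how aggressively to warm the L1 cache
    l1_share: float      # fraction of L1 capacity reserved for the slice
    priority_path: bool  # whether requests take the priority L1 path

# Policy table mirroring the slice behaviors above.
SLICE_POLICIES = {
    "eMBB":       SlicePolicy(prefetch="aggressive",       l1_share=0.50, priority_path=False),
    "URLLC":      SlicePolicy(prefetch="minimal",          l1_share=0.15, priority_path=True),
    "mMTC":       SlicePolicy(prefetch="compact-values",   l1_share=0.20, priority_path=False),
    "enterprise": SlicePolicy(prefetch="compliance-first", l1_share=0.15, priority_path=False),
}

def policy_for(slice_type: str) -> SlicePolicy:
    """Resolve caching behavior from the slice a request arrived on."""
    return SLICE_POLICIES[slice_type]
```

Keeping the policy a pure lookup means slice-differentiated behavior costs nothing on the hot path: the slice identifier is already attached to the request by UPF steering.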

Deployment: Under 30 Seconds

Cachee ships as a set of container images that deploy into any Kubernetes-based MEC platform. The deployment manifest is straightforward:

# cachee-mec-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cachee-mec
  namespace: edge-services
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cachee-mec
  template:
    metadata:
      labels:
        app: cachee-mec
    spec:
      containers:
      - name: cachee-ai-engine
        image: cachee/mec-ai:latest
        resources:
          requests: { cpu: "4", memory: "8Gi" }
          limits:   { cpu: "8", memory: "16Gi" }
        env:
        - name: PREDICTION_WINDOW_MIN
          value: "30"
        - name: L1_CACHE_MAX_ENTRIES
          value: "10000000"
      - name: cachee-edge-proxy
        image: cachee/mec-proxy:latest
        resources:
          requests: { cpu: "2", memory: "4Gi" }
        volumeMounts:
        - name: nvme-cache
          mountPath: /cache
      - name: cachee-compliance
        image: cachee/mec-compliance:latest
        env:
        - name: REGULATIONS
          value: "gdpr,hipaa,pci-dss,ccpa"
      volumes:
      - name: nvme-cache
        hostPath:
          path: /mnt/nvme0

Three kubectl apply commands and Cachee is live at the edge. Container orchestration handles rolling updates, health checks, and auto-scaling. No carrier network engineer needs to touch the 5G core configuration.

Important distinction: Cachee operates exclusively in the user plane. It does not interact with the 5G control plane (AMF, SMF, NSSF). This means no 3GPP certification is required for the caching layer, and carrier network operations teams can deploy it as a standard MEC application.

Why MEC Instead of Central Cloud?

The physics are simple. A centralized cloud cache sits 20-50ms away from the user (through the 5G core, transport network, and cloud ingress). A MEC-deployed cache sits 3-7ms away (just the radio hop). Cachee's L1 cache at the MEC edge adds effectively zero latency on top of that radio hop.

For applications that need sub-15ms response times—cloud gaming, AR overlays, real-time collaboration—the difference between 50ms and 10.5ms is the difference between usable and unusable. MEC is the only deployment model that gets there.

What Cachee Does Not Replace

Cachee is not a CDN, not a 5G core component, and not a replacement for origin servers. It is a predictive caching layer that:

- Serves 94-98% of requests directly from the MEC edge
- Forwards the remaining 2-6% to origin over persistent connection pools
- Operates entirely in the user plane, invisible to the 5G control plane

The 5G core, carrier CDN, and cloud backends all continue to function exactly as they do today. Cachee simply intercepts the requests it can serve faster and lets everything else pass through unchanged.

See the Full Architecture

Interactive diagrams, deployment guides, and integration details for carrier MEC environments.

Explore 5G Telecom →