Mastercard and Visa process over 150 billion transactions per year combined. Every single transaction receives a fraud score. These two networks have the strictest latency requirements in the entire payments industry because they sit between the merchant and the issuing bank — every millisecond they add to the authorization path is a millisecond added to every card swipe on Earth. Current architecture: transaction arrives, feature retrieval from distributed feature stores takes 5–20ms, ML model inference takes 1–3ms, decision returned. Feature retrieval dominates. L1 caching at the network’s processing nodes changes the math fundamentally.
The Unique Constraints of Network-Level Fraud
Mastercard’s Decision Intelligence and Visa’s Advanced Authorization operate at a scale, and under a latency budget, that no other fraud system in the world matches. Visa processes over 65,000 transactions per second at peak; Mastercard handles similar volumes. The authorization latency budget — the time from when a transaction enters the network to when a fraud score is returned to the issuing bank — is measured in single-digit milliseconds. This budget is not chosen: the merchant, acquirer, and issuer all have their own processing time, and the total end-to-end authorization must complete in under 2 seconds. The network’s fraud scoring window is a fraction of that.
Unlike Stripe or PayPal, which process transactions on behalf of merchants, Mastercard and Visa see every transaction on every card in every country. This gives them an unparalleled dataset for fraud detection — cross-merchant patterns, cross-issuer velocity, global fraud ring detection — but it also means every feature lookup is against a truly global feature store. The merchant risk score for a small shop in Tokyo and the cardholder velocity for a user in Brazil must both be accessible in microseconds.
The Feature Retrieval Bottleneck at Global Scale
A Mastercard Decision Intelligence scoring request evaluates:

- the cardholder’s transaction history embedding
- the merchant category risk score
- the merchant-specific fraud rate
- the BIN (bank identification number) risk classification
- the geographic anomaly score (is this transaction in an unusual location for this card?)
- the cross-network velocity (how many transactions has this card made across all merchants in the last N minutes?)
- the device/channel risk (card-present vs card-not-present, EMV vs magnetic stripe)
- the transaction amount deviation from the cardholder’s typical pattern
- the issuer-level fraud trend
- several proprietary graph features that detect coordinated fraud rings
Each of these features requires a lookup against a distributed data store. Mastercard and Visa operate processing centers on multiple continents. The feature stores are replicated globally, but even with regional replicas, each lookup involves: a hash to find the correct shard, a network hop to the feature store node (even within the same data center, this is 0.1–0.5ms), serialization/deserialization of the feature vector, and contention under peak load. At 65,000+ TPS, cache contention in the feature store is not theoretical — it is the primary source of p99 latency spikes.
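The per-lookup arithmetic above can be sketched as a simple cost model. The hop range comes from the text; the serde cost, the function name, and the assumption of sequential lookups are illustrative, not a description of either network's actual retrieval path:

```python
# Hypothetical cost model for remote feature retrieval (illustrative sketch).
NETWORK_HOP_MS = (0.1, 0.5)  # intra-datacenter hop to a feature store node
SERDE_MS = 0.05              # serialize/deserialize one feature vector (assumed)
N_FEATURES = 15              # distinct feature lookups per scoring request

def retrieval_range_ms(n_features: int = N_FEATURES) -> tuple[float, float]:
    """Best/worst-case retrieval time if the lookups run sequentially."""
    low = n_features * (NETWORK_HOP_MS[0] + SERDE_MS)
    high = n_features * (NETWORK_HOP_MS[1] + SERDE_MS)
    return low, high

low, high = retrieval_range_ms()
print(f"sequential remote retrieval: {low:.1f}-{high:.1f} ms")
```

Even this optimistic model lands in the low milliseconds; shard contention and tail effects under peak load push observed p99 retrieval toward the 5–20ms range quoted above, parallelization notwithstanding.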
Current: Feature retrieval at network scale
L1 Caching at the Processing Node
Mastercard and Visa operate dedicated processing nodes — high-performance servers that handle transaction routing and fraud scoring. The key architectural insight is that these processing nodes can maintain an L1 in-process cache of hot fraud features. The hot set at the network level is remarkably predictable: the top 10 million active cardholders (by recent transaction activity) cover 90%+ of incoming transactions. The top 500,000 merchants (by transaction volume) cover 95%+ of merchant lookups. Cross-network velocity counters for active cards are accessed repeatedly within short time windows.
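A minimal sketch of such an in-process L1, assuming a plain LRU eviction policy. The `L1FeatureCache` name, capacity, and key format are illustrative — this is not a Mastercard or Visa API:

```python
from collections import OrderedDict

class L1FeatureCache:
    """In-process LRU cache for hot fraud features (illustrative sketch)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data: OrderedDict[str, bytes] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str):
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._data[key]
        self.misses += 1
        return None  # caller falls back to the remote feature store

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used

# Usage: the scoring path checks L1 first; only misses pay the remote round-trip.
cache = L1FeatureCache(capacity=10_000_000)  # e.g. top 10M active cardholders
cache.put("card:123:embedding", b"\x00" * 1024)
assert cache.get("card:123:embedding") is not None
```

Because the hot set is so skewed (10 million cardholders cover 90%+ of traffic), a simple LRU over in-process memory captures most of the benefit without any coordination between processing nodes.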
At 1.5 microseconds per L1 lookup, 15 features take 22.5 microseconds. Sub-millisecond feature retrieval. Combined with 2ms model inference, the total fraud decisioning time drops to 2.0225ms — sub-5ms with comfortable margin.
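The arithmetic, spelled out (all constants are the figures from the text):

```python
L1_LOOKUP_US = 1.5   # per-feature L1 lookup, microseconds
N_FEATURES = 15
INFERENCE_MS = 2.0   # model inference budget

retrieval_ms = N_FEATURES * L1_LOOKUP_US / 1000  # 0.0225 ms
total_ms = retrieval_ms + INFERENCE_MS           # ~2.0225 ms, sub-5ms with margin
```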
With L1 caching at processing nodes
The Memory Budget at Network Scale
Can the hot set actually fit in process memory on a network processing node? The math works. Cardholder embeddings: 10 million active cardholders × 256 dimensions × 4 bytes ≈ 10.2GB. Merchant risk profiles: 500,000 merchants × 1KB each = 500MB. Velocity counters: 10 million active cards × 64 bytes (multiple time windows) = 640MB. Graph features (pre-computed): 10 million cards × 128 bytes = 1.28GB. BIN risk tables: ~50,000 BINs × 256 bytes = 12.8MB. Total: approximately 12.7GB — well within the RAM of a modern server.
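The same budget as a quick back-of-the-envelope calculation (decimal units, matching the figures above):

```python
# Memory budget for the hot set, in bytes (decimal GB/MB/KB).
components = {
    "cardholder embeddings": 10_000_000 * 256 * 4,  # 10M cards x 256 dims x 4B
    "merchant risk profiles": 500_000 * 1_000,      # 500k merchants x 1KB
    "velocity counters": 10_000_000 * 64,           # multiple time windows per card
    "graph features": 10_000_000 * 128,             # pre-computed, per card
    "BIN risk tables": 50_000 * 256,
}

total_bytes = sum(components.values())
for name, size in components.items():
    print(f"{name:24s} {size / 1e9:6.2f} GB")
print(f"{'total':24s} {total_bytes / 1e9:6.2f} GB")  # ~12.67 GB
```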
The Scale Impact: Fewer Servers, Lower Power, Better Models
At 5,000+ transactions per second per processing node, every millisecond of latency reduction has a direct impact on throughput. When fraud scoring takes 15ms at p99, each processing thread is occupied for 15ms per transaction; at 2ms, the same thread handles 7.5x as many transactions. Across a global network of processing nodes, this means fewer servers at each processing center, lower power consumption (a non-trivial concern when you operate thousands of servers globally), and lower cooling costs.
| Metric | Current | With L1 | Impact |
|---|---|---|---|
| Fraud score latency (p99) | 15-22ms | 2.0225ms | 7-10x faster |
| Feature retrieval | 5-20ms | 0.0225ms | 222-888x faster |
| TPS per processing node | 5,000+ | 35,000+ | 7x throughput |
| Processing nodes needed | N | N/7 | ~86% fewer nodes |
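A rough thread-occupancy model behind these rows — illustrative only, since it assumes scoring time dominates and one in-flight transaction per thread:

```python
def tps_per_thread(score_ms: float) -> float:
    """Transactions per second one thread can handle if it is
    occupied for score_ms per transaction."""
    return 1000 / score_ms

before = tps_per_thread(15)  # ~67 TPS per thread at 15 ms p99
after = tps_per_thread(2)    # 500 TPS per thread at 2 ms
print(f"throughput gain: {after / before:.1f}x")  # 7.5x
node_reduction = 1 - before / after  # ~0.867 -> ~86% fewer nodes at equal load
```

In practice the realized gain is lower — threads do routing and I/O work besides scoring — which is why the table conservatively quotes 7x rather than the ideal 7.5x.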
But the most impactful benefit is not infrastructure savings — it is model headroom. When fraud scoring takes 2ms instead of 15ms, Mastercard and Visa can run dramatically more sophisticated models without exceeding their latency budgets. Ensemble methods (running 3–5 models and combining scores), real-time graph neural networks for fraud ring detection, transformer-based sequence models that analyze entire transaction histories — all of these require more compute time, and all of them improve detection accuracy. The latency savings from L1 caching directly fund model sophistication.
Visa has publicly stated that its Advanced Authorization prevents approximately $25 billion in fraud annually. Mastercard’s Decision Intelligence reports similar figures. A 10% improvement in detection accuracy from more sophisticated models — made possible by the latency headroom from L1 caching — prevents an additional $2.5–5 billion in annual fraud losses across both networks. That is the true value proposition: not faster fraud scoring for its own sake, but faster fraud scoring as a foundation for fundamentally better fraud detection.
For Mastercard and Visa, every millisecond in the authorization path affects every card transaction on the planet. L1 caching at the processing node level turns feature retrieval from the bottleneck into a rounding error — and unlocks the next generation of fraud models that their data scientists have been waiting to deploy.