Multi-Tenant Caching Architecture for SaaS Applications
Multi-tenant SaaS applications serve hundreds or thousands of customers from shared infrastructure. Caching in multi-tenant environments introduces unique challenges: tenant isolation, fair resource allocation, noisy neighbor prevention, and per-tenant cost tracking. This guide shows you how to design caching architectures that scale efficiently while maintaining strict tenant boundaries.
Multi-Tenant Caching Challenges
- Tenant isolation: Prevent data leakage between tenants
- Noisy neighbors: One tenant shouldn't consume all cache resources
- Fair resource allocation: Balance cache space across tenants
- Variable usage patterns: Tenants have different access patterns
- Cost attribution: Track cache costs per tenant
- Scaling heterogeneity: Different tenant tiers need different resources
Architecture Pattern 1: Shared Cache with Namespacing
The simplest approach: all tenants share the same cache infrastructure with tenant prefixes.
// Tenant-aware cache wrapper
class TenantCache {
  constructor(tenantId, baseCache) {
    this.tenantId = tenantId;
    this.cache = baseCache;
  }
  // Automatically prefix keys with tenant ID
  _key(key) {
    return `tenant:${this.tenantId}:${key}`;
  }
  async get(key) {
    return this.cache.get(this._key(key));
  }
  async set(key, value, options) {
    return this.cache.set(this._key(key), value, options);
  }
  async delete(key) {
    return this.cache.delete(this._key(key));
  }
  // Bulk operations scoped to tenant
  async deleteAll() {
    const pattern = `tenant:${this.tenantId}:*`;
    return this.cache.deletePattern(pattern);
  }
}
// Usage in application
app.use((req, res, next) => {
  const tenantId = req.headers['x-tenant-id'];
  if (!tenantId) {
    // Never fall through to an unscoped cache
    return res.status(400).json({ error: 'Missing x-tenant-id header' });
  }
  req.cache = new TenantCache(tenantId, globalCache);
  next();
});
// Automatic tenant isolation
app.get('/api/users/:id', async (req, res) => {
  // Cache key automatically scoped to tenant
  const user = await req.cache.get(`user:${req.params.id}`);
  // Stored as: tenant:acme-corp:user:123
  res.json(user);
});
Advantages
- Simple implementation
- Cost-effective (shared infrastructure)
- Easy to deploy and manage
Disadvantages
- No resource isolation (noisy neighbor problem)
- Large tenant can evict small tenant's data
- No per-tenant memory guarantees
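Before adding isolation machinery, it is worth verifying that namespacing alone guarantees logical separation. This is a minimal, synchronous sketch in which a Map-backed `MapCache` (illustrative, not a real client) stands in for the shared cache:

```javascript
// Minimal in-memory stand-in for the shared cache (illustrative only).
class MapCache {
  constructor() { this.store = new Map(); }
  get(key) { return this.store.get(key); }
  set(key, value) { this.store.set(key, value); }
  deletePattern(prefix) {
    for (const key of this.store.keys()) {
      if (key.startsWith(prefix)) this.store.delete(key);
    }
  }
}

// Same wrapper shape as TenantCache above, synchronous for brevity.
class NamespacedCache {
  constructor(tenantId, baseCache) {
    this.tenantId = tenantId;
    this.base = baseCache;
  }
  _key(key) { return `tenant:${this.tenantId}:${key}`; }
  get(key) { return this.base.get(this._key(key)); }
  set(key, value) { this.base.set(this._key(key), value); }
  deleteAll() { this.base.deletePattern(`tenant:${this.tenantId}:`); }
}

const shared = new MapCache();
const acme = new NamespacedCache('acme-corp', shared);
const globex = new NamespacedCache('globex', shared);

acme.set('user:123', { name: 'Alice' });
globex.set('user:123', { name: 'Bob' });
// Each tenant sees only its own value for the same logical key,
// and deleteAll removes only that tenant's entries.
```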
Architecture Pattern 2: Cache Partitioning by Tenant Tier
Separate cache clusters for different tenant tiers (free, pro, enterprise).
const CACHE_CLUSTERS = {
  free: {
    host: 'cache-free.company.com',
    maxMemory: '4GB',
    maxTenants: 10000
  },
  pro: {
    host: 'cache-pro.company.com',
    maxMemory: '32GB',
    maxTenants: 1000
  },
  enterprise: {
    host: 'cache-enterprise.company.com',
    maxMemory: '128GB',
    maxTenants: 100
  }
};
class TierBasedCacheRouter {
  constructor() {
    this.caches = {};
    // Initialize cache clients for each tier
    for (const [tier, config] of Object.entries(CACHE_CLUSTERS)) {
      this.caches[tier] = new Redis(config);
    }
  }
  async getCacheForTenant(tenantId) {
    // getTenantTier is async, so it must be awaited here;
    // otherwise tier would be a Promise and the lookup would fail
    const tier = await this.getTenantTier(tenantId);
    return new TenantCache(tenantId, this.caches[tier]);
  }
  async getTenantTier(tenantId) {
    // Fetch from database or cache
    return await tenantService.getTier(tenantId);
  }
}
// Usage
const router = new TierBasedCacheRouter();
app.use(async (req, res, next) => {
  const tenantId = req.headers['x-tenant-id'];
  req.cache = await router.getCacheForTenant(tenantId);
  next();
});
Benefits
- Tier-appropriate resources
- Noisy neighbor isolation across tiers
- Easy to scale specific tiers
- Clear cost attribution
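The routing logic itself is small; this sketch uses plain `Map` objects in place of the per-tier Redis clusters, and a hardcoded `TENANT_TIERS` lookup (an assumption, standing in for the database call):

```javascript
// Hypothetical tenant-to-tier mapping; in production this comes from the DB.
const TENANT_TIERS = { 'acme-corp': 'enterprise', 'smallco': 'free' };

class TierRouter {
  constructor(tiers) {
    // One isolated store per tier, standing in for one cluster per tier.
    this.caches = {};
    for (const tier of tiers) this.caches[tier] = new Map();
  }
  getCacheForTenant(tenantId) {
    // Unknown tenants default to the free tier.
    const tier = TENANT_TIERS[tenantId] || 'free';
    return this.caches[tier];
  }
}

const router = new TierRouter(['free', 'pro', 'enterprise']);
router.getCacheForTenant('acme-corp').set('k', 'enterprise-data');
router.getCacheForTenant('smallco').set('k', 'free-data');
// The same key lands in physically separate stores per tier.
```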
Architecture Pattern 3: Per-Tenant Resource Quotas
Enforce cache memory limits per tenant within shared infrastructure.
class QuotaEnforcedCache {
  constructor(tenantId, baseCache) {
    this.tenantId = tenantId;
    this.cache = baseCache;
    this.quotas = null; // loaded lazily; constructors cannot await
  }
  _key(key) {
    return `tenant:${this.tenantId}:${key}`;
  }
  async loadQuotas() {
    if (!this.quotas) {
      // Per-tenant limits
      this.quotas = {
        maxMemoryMB: await this.getMemoryQuota(),
        maxKeys: await this.getKeyQuota(),
        maxTTL: await this.getMaxTTL()
      };
    }
    return this.quotas;
  }
  async getMemoryQuota() {
    const tier = await tenantService.getTier(this.tenantId);
    const quotas = {
      free: 10,        // 10MB
      pro: 100,        // 100MB
      enterprise: 1000 // 1GB
    };
    return quotas[tier];
  }
  async set(key, value, options = {}) {
    const quotas = await this.loadQuotas();
    // Check quota before writing
    const currentUsage = await this.getMemoryUsage();
    const valueSize = this.estimateSize(value);
    if (currentUsage + valueSize > quotas.maxMemoryMB * 1024 * 1024) {
      // Quota exceeded: make room before writing
      await this.enforceQuota(valueSize);
    }
    // Enforce max TTL
    const ttl = Math.min(options.ttl || 3600, quotas.maxTTL);
    return this.cache.set(this._key(key), value, { ...options, ttl });
  }
  async enforceQuota(requiredSpace) {
    // Evict least recently used keys for this tenant.
    // Note: KEYS blocks Redis; prefer SCAN in production.
    const keys = await this.cache.keys(`tenant:${this.tenantId}:*`);
    const keyMeta = await Promise.all(
      keys.map(async k => ({
        key: k,
        idleTime: await this.cache.object('idletime', k)
      }))
    );
    // Sort by idle time, most idle first (LRU)
    keyMeta.sort((a, b) => b.idleTime - a.idleTime);
    // Evict until enough space is freed
    let freedSpace = 0;
    for (const { key } of keyMeta) {
      if (freedSpace >= requiredSpace) break;
      const size = await this.cache.memoryUsage(key);
      await this.cache.del(key);
      freedSpace += size;
      this.logQuotaEviction(key);
    }
  }
  async getMemoryUsage() {
    const keys = await this.cache.keys(`tenant:${this.tenantId}:*`);
    const sizes = await Promise.all(
      keys.map(k => this.cache.memoryUsage(k))
    );
    return sizes.reduce((sum, size) => sum + size, 0);
  }
  estimateSize(value) {
    return Buffer.byteLength(JSON.stringify(value));
  }
  logQuotaEviction(key) {
    logger.warn(`Quota eviction for tenant ${this.tenantId}: ${key}`);
    metrics.increment('cache.quota_eviction', { tenant: this.tenantId });
  }
}
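The quota-plus-LRU idea above depends on Redis, but it can be sketched synchronously with a `Map`, whose insertion order doubles as recency order. Sizes use the same JSON byte-length estimate as `estimateSize`; the `QuotaCache` class here is illustrative:

```javascript
// Synchronous sketch of per-tenant quota enforcement with LRU eviction.
class QuotaCache {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.store = new Map(); // Map iteration order doubles as LRU order
    this.used = 0;
  }
  size(value) { return Buffer.byteLength(JSON.stringify(value)); }
  get(key) {
    if (!this.store.has(key)) return undefined;
    const entry = this.store.get(key);
    // Re-insert to mark as most recently used.
    this.store.delete(key);
    this.store.set(key, entry);
    return entry.value;
  }
  set(key, value) {
    const bytes = this.size(value);
    if (this.store.has(key)) {
      this.used -= this.store.get(key).bytes;
      this.store.delete(key);
    }
    // Evict least recently used entries until the new value fits.
    // (A value larger than the whole quota is still admitted once the
    // store is empty -- reject instead if that matters.)
    while (this.used + bytes > this.maxBytes && this.store.size > 0) {
      const [oldestKey, oldest] = this.store.entries().next().value;
      this.store.delete(oldestKey);
      this.used -= oldest.bytes;
    }
    this.store.set(key, { value, bytes });
    this.used += bytes;
  }
}
```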
Architecture Pattern 4: Dedicated Cache per Large Tenant
Enterprise tenants get dedicated cache instances for guaranteed performance.
class HybridCacheRouter {
  constructor() {
    this.sharedCache = new Redis({ host: 'cache-shared.company.com' });
    this.dedicatedCaches = new Map();
  }
  async getCacheForTenant(tenantId) {
    const config = await this.getTenantConfig(tenantId);
    if (config.dedicatedCache) {
      // Enterprise tenant with dedicated cache
      if (!this.dedicatedCaches.has(tenantId)) {
        this.dedicatedCaches.set(tenantId, new Redis({
          host: config.cacheHost,
          password: config.cachePassword
        }));
      }
      // Wrap for a consistent interface; the prefix is harmless on a
      // single-tenant instance
      return new TenantCache(tenantId, this.dedicatedCaches.get(tenantId));
    }
    // Standard tenant on shared cache
    return new TenantCache(tenantId, this.sharedCache);
  }
  async getTenantConfig(tenantId) {
    const [row] = await db.query(
      `SELECT tier, dedicated_cache AS dedicatedCache,
              cache_host AS cacheHost, cache_password AS cachePassword
       FROM tenants WHERE id = ?`,
      [tenantId]
    );
    return row;
  }
}
When to Use Dedicated Caches
- Enterprise SLA requirements
- Regulatory/compliance isolation needs
- Tenants with >10% of total traffic
- High-value customers justifying infrastructure cost
Fair Resource Allocation Strategies
Strategy 1: Time-Based Quotas
// Limit tenant to N requests per minute
class RateLimitedCache {
  constructor(baseCache) {
    this.cache = baseCache;
  }
  getCurrentMinute() {
    return Math.floor(Date.now() / 60000);
  }
  async checkRateLimit(tenantId) {
    const key = `ratelimit:${tenantId}:${this.getCurrentMinute()}`;
    const count = await this.cache.incr(key);
    if (count === 1) {
      // First request in this window: start the 60s expiry clock
      await this.cache.expire(key, 60);
    }
    const limit = await this.getRequestLimit(tenantId);
    if (count > limit) {
      throw new Error(`Rate limit exceeded for tenant ${tenantId}`);
    }
  }
  async getRequestLimit(tenantId) {
    const tier = await tenantService.getTier(tenantId);
    return {
      free: 1000,        // 1K requests/minute
      pro: 10000,        // 10K requests/minute
      enterprise: 100000 // 100K requests/minute
    }[tier];
  }
}
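The fixed-window counter can be exercised without Redis by injecting a clock. This sketch (a hypothetical `FixedWindowLimiter`) keeps counters in a `Map` keyed by tenant and minute:

```javascript
// Sketch of a fixed-window rate limiter with an injectable clock, so the
// window rollover can be tested without waiting a real minute.
class FixedWindowLimiter {
  constructor(limit, now = () => Date.now()) {
    this.limit = limit;
    this.now = now;
    this.counts = new Map(); // "tenant:minute" -> request count
  }
  check(tenantId) {
    const minute = Math.floor(this.now() / 60000);
    const key = `${tenantId}:${minute}`;
    const count = (this.counts.get(key) || 0) + 1;
    this.counts.set(key, count);
    return count <= this.limit; // false = request should be rejected
  }
}
```

A new key per minute means counters reset naturally at each window boundary, mirroring the 60-second `expire` in the Redis version.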
Strategy 2: Weighted Fair Queuing
// Prioritize cache operations by tenant tier
class WeightedCacheQueue {
  constructor() {
    this.queues = { enterprise: [], pro: [], free: [] };
    // Share of each 100-slot processing batch
    this.weights = {
      enterprise: 50, // 50% of cache bandwidth
      pro: 35,        // 35% of cache bandwidth
      free: 15        // 15% of cache bandwidth
    };
  }
  async execute(tenantId, operation) {
    const tier = await tenantService.getTier(tenantId);
    // Wrap the operation so the caller gets its own result back,
    // not the whole batch's results
    const promise = new Promise((resolve, reject) => {
      this.queues[tier].push(() => operation().then(resolve, reject));
    });
    this.processQueues();
    return promise;
  }
  async processQueues() {
    // Weighted round-robin: each tier contributes at most its weight
    // of slots per batch (enterprise 50, pro 35, free 15)
    const batch = [];
    for (const [tier, weight] of Object.entries(this.weights)) {
      batch.push(...this.queues[tier].splice(0, weight));
    }
    return Promise.all(batch.map(op => op()));
  }
}
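A synchronous sketch of the same weighted draining shows the fairness property directly: each batch takes at most weight-many queued operations per tier, so a flood of free-tier work cannot starve enterprise traffic. The `WeightedDrain` class and its small weights are illustrative:

```javascript
// Synchronous sketch of weighted batch draining across tier queues.
class WeightedDrain {
  constructor(weights) {
    this.weights = weights;
    // One FIFO queue per tier
    this.queues = Object.fromEntries(Object.keys(weights).map(t => [t, []]));
  }
  enqueue(tier, op) { this.queues[tier].push(op); }
  drainBatch() {
    const results = [];
    for (const [tier, weight] of Object.entries(this.weights)) {
      // Take at most `weight` operations from this tier per batch
      for (const op of this.queues[tier].splice(0, weight)) {
        results.push(op());
      }
    }
    return results;
  }
}
```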
Monitoring Multi-Tenant Caches
Per-Tenant Metrics
class TenantMetrics {
  async recordOperation(tenantId, operation, duration, result) {
    // Record per-tenant metrics
    await metrics.record(`cache.${operation}`, duration, {
      tenant: tenantId,
      result, // hit, miss, error
      tier: await this.getTier(tenantId)
    });
    // Track tenant-specific stats
    await this.updateTenantStats(tenantId, { operation, duration, result });
  }
  async updateTenantStats(tenantId, data) {
    const key = `stats:tenant:${tenantId}`;
    const stats = (await this.cache.get(key)) || {
      totalOps: 0,
      hits: 0,
      misses: 0,
      avgLatency: 0,
      memoryUsed: 0
    };
    stats.totalOps++;
    if (data.result === 'hit') stats.hits++;
    if (data.result === 'miss') stats.misses++;
    // Incremental mean; note this read-modify-write is racy under
    // concurrency -- a Lua script or hash counters would make it atomic
    stats.avgLatency =
      (stats.avgLatency * (stats.totalOps - 1) + data.duration) / stats.totalOps;
    await this.cache.set(key, stats, { ttl: 3600 });
  }
  async getTenantDashboard(tenantId) {
    const stats = await this.cache.get(`stats:tenant:${tenantId}`);
    // Guard against missing stats and division by zero
    if (!stats || stats.totalOps === 0) return null;
    return {
      hitRate: stats.hits / stats.totalOps,
      avgLatency: stats.avgLatency,
      memoryUsed: await this.getMemoryUsage(tenantId),
      quota: await this.getMemoryQuota(tenantId)
    };
  }
}
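The incremental-mean bookkeeping in `updateTenantStats` is easy to verify in isolation. This sketch applies the same formula to a plain stats object:

```javascript
// Sketch of the per-tenant counters above, kept in a plain object so the
// incremental-mean arithmetic is easy to check by hand.
function recordOp(stats, result, durationMs) {
  stats.totalOps += 1;
  if (result === 'hit') stats.hits += 1;
  if (result === 'miss') stats.misses += 1;
  // Incremental mean: newAvg = (oldAvg * (n - 1) + x) / n
  stats.avgLatency =
    (stats.avgLatency * (stats.totalOps - 1) + durationMs) / stats.totalOps;
  return stats;
}

const stats = { totalOps: 0, hits: 0, misses: 0, avgLatency: 0 };
recordOp(stats, 'hit', 2);
recordOp(stats, 'miss', 10);
recordOp(stats, 'hit', 3);
```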
Alerting on Noisy Neighbors
// Detect tenants consuming excessive resources
async function detectNoisyNeighbors() {
  const tenants = await getAllTenants();
  // Compute the fleet-wide average once, not on every loop iteration
  const avgUsage = await getAverageMemoryUsage();
  for (const tenant of tenants) {
    const usage = await getMemoryUsage(tenant.id);
    if (usage > avgUsage * 5) {
      // Tenant using 5x the average
      await alertOps({
        type: 'noisy_neighbor',
        tenant: tenant.id,
        usage,
        average: avgUsage
      });
      // Optionally throttle
      await applyThrottling(tenant.id);
    }
  }
}
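The 5x-average heuristic reduces to a few lines; this sketch (a hypothetical `findNoisyNeighbors`, with illustrative usage figures) computes the fleet-wide mean once and flags tenants above the threshold. One caveat worth noting: a very large tenant inflates the average it is compared against, so a median-based baseline can be more robust.

```javascript
// Flag tenants whose usage exceeds `multiplier` times the fleet average.
function findNoisyNeighbors(usageByTenant, multiplier = 5) {
  const usages = Object.values(usageByTenant);
  const avg = usages.reduce((a, b) => a + b, 0) / usages.length;
  return Object.entries(usageByTenant)
    .filter(([, usage]) => usage > avg * multiplier)
    .map(([tenant]) => tenant);
}
```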
Cost Attribution and Billing
// Track cache costs per tenant
class TenantBilling {
  async calculateCacheCosts(tenantId, month) {
    const stats = await this.getTenantStats(tenantId, month);
    const costs = {
      memory: stats.avgMemoryGB * COST_PER_GB_MONTH,
      operations: stats.totalOps * COST_PER_MILLION_OPS / 1e6,
      bandwidth: stats.bandwidthGB * COST_PER_GB_BANDWIDTH
    };
    return {
      total: Object.values(costs).reduce((a, b) => a + b, 0),
      breakdown: costs
    };
  }
  async getTenantStats(tenantId, month) {
    // Aggregate monthly stats from time-series data (single-row result)
    const [stats] = await metricsDB.query(`
      SELECT
        AVG(memory_bytes) / 1e9 AS avgMemoryGB,
        SUM(operations) AS totalOps,
        SUM(bandwidth_bytes) / 1e9 AS bandwidthGB
      FROM cache_metrics
      WHERE tenant_id = ? AND month = ?
    `, [tenantId, month]);
    return stats;
  }
}
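The cost model itself is simple arithmetic. This sketch uses illustrative unit prices; the `COST_*` constants are assumptions, not real rates:

```javascript
// Illustrative unit prices -- substitute your real infrastructure rates.
const COST_PER_GB_MONTH = 10;       // $ per GB-month of cache memory
const COST_PER_MILLION_OPS = 0.2;   // $ per million operations
const COST_PER_GB_BANDWIDTH = 0.05; // $ per GB transferred

// Same shape as calculateCacheCosts above, as a pure function of the
// aggregated monthly stats.
function cacheCosts(stats) {
  const breakdown = {
    memory: stats.avgMemoryGB * COST_PER_GB_MONTH,
    operations: (stats.totalOps / 1e6) * COST_PER_MILLION_OPS,
    bandwidth: stats.bandwidthGB * COST_PER_GB_BANDWIDTH
  };
  const total = Object.values(breakdown).reduce((a, b) => a + b, 0);
  return { total, breakdown };
}
```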
Best Practices Summary
- Always namespace keys with tenant ID to prevent data leakage
- Implement quotas to prevent noisy neighbors
- Monitor per-tenant metrics for usage tracking and billing
- Use tiered infrastructure for different customer segments
- Provide dedicated caches for high-value enterprise customers
- Track costs per tenant for accurate billing and chargeback
- Alert on anomalies to catch abuse or misconfigurations
Conclusion
Multi-tenant caching requires careful design to balance efficiency, isolation, and fairness. Start with shared caches and namespacing for simplicity. Add quotas and rate limiting to prevent noisy neighbors. Implement tiered infrastructure as you scale. Provide dedicated caches for enterprise customers with strict SLAs. Always monitor per-tenant metrics for billing, optimization, and anomaly detection.
The right architecture depends on your tenant distribution, SLA requirements, and cost constraints. Most SaaS applications benefit from a hybrid approach: shared caches for small tenants, dedicated infrastructure for large enterprise customers.
Multi-Tenant Caching Built-In
Cachee.ai automatically handles tenant isolation, quotas, and fair resource allocation with zero configuration.
Start Free Trial