At 50,000 requests/sec, your database is the first thing that dies. Redis is what keeps it alive — but only if you're using it correctly. Most engineers slap on a basic key-value cache and call it a day. That's leaving 80% of the performance gains on the table.
After managing infrastructure at FAANG scale, here are the 6 Redis caching patterns that actually move the needle on cost and reliability.
Why Redis Over a Simple In-Memory Cache?
Before patterns — why Redis specifically?
- Shared across all app instances — in-memory caches are per-process, so horizontal scaling creates cache misses on every new pod
- Persistence options — RDB snapshots + AOF for disaster recovery
- Rich data structures — hashes, sorted sets, streams (not just key-value)
- Pub/sub + Lua scripting — enables atomic operations you can't do in a simple map
- Eviction policies — LRU, LFU, TTL-based — automatic memory management
At 10 app pods, each per-process cache must warm up independently, so you get roughly 10x the cold-start misses of a single shared Redis instance.
Pattern 1: Cache-Aside (Lazy Loading)
The most common pattern. App checks cache first; on miss, loads from DB and writes to cache.
```python
import redis
import json

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def get_user(user_id: int) -> dict:
    cache_key = f"user:{user_id}"

    # 1. Check cache
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)

    # 2. Cache miss — fetch from DB
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # 3. Populate cache with TTL
    r.setex(cache_key, 3600, json.dumps(user))  # 1 hour TTL
    return user
```
When to use: Read-heavy workloads where cache misses are acceptable (user profiles, product pages, blog posts).
Tradeoff: First request after cache expiry hits the DB — acceptable for most use cases.
Real impact: A 100-RPS endpoint with 95% cache hit rate means only 5 DB queries/sec instead of 100. At AWS RDS pricing ($0.10–0.30/hour per vCPU), this translates to running a db.t3.medium instead of a db.r5.2xlarge.
Pattern 2: Write-Through
Write to cache and DB simultaneously. Cache is always warm — no cold-start misses.
```python
def update_user(user_id: int, data: dict) -> dict:
    cache_key = f"user:{user_id}"

    # 1. Write to DB first
    updated_user = db.execute(
        "UPDATE users SET name=%s, email=%s WHERE id=%s RETURNING *",
        data['name'], data['email'], user_id
    )

    # 2. Immediately update cache
    r.setex(cache_key, 3600, json.dumps(updated_user))
    return updated_user
```
When to use: Data that's read frequently right after being written (order status, user settings, config values).
Tradeoff: Higher write latency (two writes per update). Cache may store data that's never read (wasted memory for write-heavy datasets).
Optimization: Combine with a background job to expire unused keys weekly.
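That weekly cleanup job can be sketched with SCAN plus OBJECT IDLETIME. This is an assumption-laden sketch, not from the original article: the `user:*` pattern and 7-day threshold are illustrative, and OBJECT IDLETIME only works under noeviction or an LRU eviction policy (it errors under LFU).

```python
def purge_idle_keys(r, pattern="user:*", max_idle_seconds=7 * 24 * 3600):
    """Delete cached keys that haven't been read recently.

    Sketch of the weekly cleanup job; `pattern` and the 7-day threshold
    are assumptions. OBJECT IDLETIME requires a non-LFU maxmemory-policy.
    """
    deleted = 0
    # SCAN iterates incrementally instead of blocking Redis like KEYS would
    for key in r.scan_iter(match=pattern, count=1000):
        idle = r.object("idletime", key)  # seconds since last access
        if idle is not None and idle > max_idle_seconds:
            r.delete(key)
            deleted += 1
    return deleted
```

Run it from cron or a scheduler; SCAN's `count` hint keeps each round trip cheap on a busy instance.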
Pattern 3: Write-Behind (Write-Back)
Write to cache immediately, flush to DB asynchronously. Ultra-fast writes.
```python
import asyncio
from collections import deque

write_queue = deque()

def update_counter(entity_id: int, delta: int):
    cache_key = f"counter:{entity_id}"
    # Atomic increment in Redis — O(1), sub-millisecond
    new_value = r.incrby(cache_key, delta)
    # Queue DB flush (async, non-blocking)
    write_queue.append((entity_id, new_value))
    return new_value

async def flush_to_db():
    """Background task — runs every 5 seconds"""
    while True:
        await asyncio.sleep(5)
        while write_queue:
            entity_id, value = write_queue.popleft()
            await db.execute(
                "UPDATE counters SET value=%s WHERE id=%s",
                value, entity_id
            )
```
When to use: High-frequency counters (page views, likes, inventory updates), real-time leaderboards, rate limiting.
Risk: Data loss on Redis crash if not persisted. Mitigate with Redis AOF persistence (appendonly yes in config).
Real impact: A social platform tracking 10M daily "like" events can do this entirely in Redis at $50/mo (ElastiCache) instead of hammering a $500/mo RDS with 10M writes/day.
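One caveat with the in-process deque above: queued flushes die with the app pod. A variation worth considering is keeping the pending-flush queue in Redis itself (RPUSH/LPOP), so it survives app restarts and is shared across pods. This is a sketch under assumed names (`QUEUE_KEY`, `flush_fn` are not from the article):

```python
import json

QUEUE_KEY = "counter:flush_queue"  # assumed key name, illustrative only

def update_counter_durable(r, entity_id: int, delta: int):
    """Like update_counter, but the pending flush lives in a Redis list,
    so an app-pod crash doesn't drop queued writes."""
    new_value = r.incrby(f"counter:{entity_id}", delta)
    r.rpush(QUEUE_KEY, json.dumps({"id": entity_id, "value": new_value}))
    return new_value

def drain_queue(r, flush_fn, batch=100):
    """One flush pass: pop up to `batch` queued entries and hand each to
    `flush_fn` (e.g. a DB UPDATE). Run from a background task."""
    drained = 0
    for _ in range(batch):
        raw = r.lpop(QUEUE_KEY)
        if raw is None:
            break  # queue empty
        item = json.loads(raw)
        flush_fn(item["id"], item["value"])
        drained += 1
    return drained
```

A Redis crash can still lose queued entries, so AOF persistence remains the backstop either way.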
Pattern 4: Read-Through
Redis sits in front of DB as a transparent proxy. App only talks to Redis — cache loads itself on miss.
This pattern is typically implemented at the infrastructure level — a caching proxy or shared middleware layer — rather than scattered through application code, but a thin wrapper class gives you the same ergonomics:
```python
class ReadThroughCache:
    def __init__(self, redis_client, db_loader, ttl=3600):
        self.redis = redis_client
        self.loader = db_loader  # callable: key -> value
        self.ttl = ttl

    def get(self, key: str):
        value = self.redis.get(key)
        if value is not None:
            return json.loads(value)
        # Transparent load from DB
        value = self.loader(key)
        if value:
            self.redis.setex(key, self.ttl, json.dumps(value))
        return value

# Usage — app doesn't know about DB at all
cache = ReadThroughCache(r, lambda k: db.get_by_key(k))
user = cache.get(f"user:{user_id}")
```
When to use: Clean separation of concerns — useful when multiple services need the same cached data without duplicating cache-aside logic.
Pattern 5: Cache Stampede Prevention (Mutex Lock)
The silent killer. When a popular key expires, hundreds of requests hit the DB simultaneously. This is a thundering herd problem.
```python
import time
import uuid

def get_with_lock(key: str, db_loader, ttl=3600, lock_ttl=10):
    # 1. Check cache
    value = r.get(key)
    if value:
        return json.loads(value)

    # 2. Try to acquire lock
    lock_key = f"lock:{key}"
    lock_id = str(uuid.uuid4())
    acquired = r.set(lock_key, lock_id, nx=True, ex=lock_ttl)

    if acquired:
        try:
            # 3. We hold the lock — load from DB
            value = db_loader(key)
            r.setex(key, ttl, json.dumps(value))
            return value
        finally:
            # 4. Release lock (only if we own it)
            lua_script = """
            if redis.call("get", KEYS[1]) == ARGV[1] then
                return redis.call("del", KEYS[1])
            else
                return 0
            end
            """
            r.eval(lua_script, 1, lock_key, lock_id)
    else:
        # 5. Another process is loading — wait and retry
        for _ in range(20):
            time.sleep(0.1)
            value = r.get(key)
            if value:
                return json.loads(value)
        # Fallback: load directly (lock holder may have crashed)
        return db_loader(key)
```
When to use: Any high-traffic endpoint where a single cache key expires — product pages, homepage content, leaderboards.
Simpler alternative: Use probabilistic early expiration (PER) — randomly refresh the key before it expires:
```python
import math
import random

def get_with_per(key: str, db_loader, ttl=3600, beta=1.0):
    """Probabilistic Early Revalidation — no locks needed"""
    data = r.get(key)
    if data:
        cached = json.loads(data)
        remaining_ttl = r.ttl(key)
        # Probabilistically refresh before expiry
        if remaining_ttl < ttl * 0.1:  # Last 10% of TTL
            if random.random() < beta * math.log(ttl / max(remaining_ttl, 1)) / ttl:
                value = db_loader(key)
                r.setex(key, ttl, json.dumps(value))
                return value
        return cached

    value = db_loader(key)
    r.setex(key, ttl, json.dumps(value))
    return value
```
Pattern 6: Cache Tagging (Invalidation Groups)
The hardest problem in caching: invalidation. When you update a product, you need to invalidate the product page, the category page, and the search results — not just product:123.
```python
def set_with_tags(key: str, value, tags: list, ttl=3600):
    """Store a value and register it under multiple invalidation tags"""
    pipe = r.pipeline()
    # Store the value
    pipe.setex(key, ttl, json.dumps(value))
    # Register key under each tag (using Redis Sets)
    for tag in tags:
        tag_key = f"tag:{tag}"
        pipe.sadd(tag_key, key)
        pipe.expire(tag_key, ttl + 60)  # Tag set lives slightly longer
    pipe.execute()

def invalidate_tag(tag: str):
    """Invalidate all keys associated with a tag"""
    tag_key = f"tag:{tag}"
    keys = r.smembers(tag_key)
    if keys:
        pipe = r.pipeline()
        for key in keys:
            pipe.delete(key)
        pipe.delete(tag_key)
        pipe.execute()
    return len(keys)

# Usage
set_with_tags(
    f"product:{product_id}",
    product_data,
    tags=[f"category:{category_id}", "products:all", f"brand:{brand_id}"]
)

# When a category updates — invalidate everything tagged to it
invalidate_tag(f"category:{category_id}")
```
When to use: E-commerce, CMS, any system where one entity change affects multiple cached views.
Production Redis Configuration
Don't run Redis with default settings. For production:
```conf
# /etc/redis/redis.conf

# Memory limit — always set this
maxmemory 2gb
maxmemory-policy allkeys-lru

# Persistence — AOF for write-behind pattern safety
appendonly yes
appendfsync everysec

# Eviction — LRU for general caching workloads
# Use allkeys-lfu for skewed access patterns (power-law distribution)

# Network
tcp-keepalive 60
timeout 300

# Performance
hz 20
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes
```
For AWS ElastiCache, set these via parameter groups. The maxmemory-policy is the single most important parameter — noeviction (the default) will crash your cache when it fills up.
Choosing the Right Pattern
| Pattern | Best For | DB Load | Write Latency | Complexity |
|---------|----------|---------|---------------|------------|
| Cache-Aside | Read-heavy, stale OK | Low | Normal | Low |
| Write-Through | Recent writes read immediately | Low | High | Medium |
| Write-Behind | High-frequency counters | Very Low | Very Low | High |
| Read-Through | Clean architecture | Low | Normal | Medium |
| Stampede Lock | Viral content, spiky traffic | Protected | Normal | High |
| Cache Tagging | Complex invalidation | Low | Normal | High |
The Real Cost Math
Here's what proper Redis caching actually saves on AWS:
Scenario: 500 RPS to a product catalog endpoint, db.r5.large ($0.24/hr)
- Without cache: 500 RPS × 20ms DB query time = DB at 100% CPU → need a db.r5.2xlarge ($0.96/hr)
- With a 95% cache hit rate: 25 RPS to the DB → a db.t3.medium ($0.068/hr) handles it comfortably
- Redis ElastiCache cache.t3.medium: $0.068/hr
Monthly savings: ($0.96 - $0.068 - $0.068) × 730 hours = ~$600/mo on one endpoint alone.
At scale, proper caching typically saves 40-70% on RDS costs. Redis pays for itself within the first week.
Common Mistakes to Avoid
- No TTL on keys — memory fills up, Redis starts evicting data randomly or crashes
- Caching mutable user-specific data without user context in the key — user A sees user B's data (security incident)
- JSON serializing large objects — serialize only what you need; a 50KB JSON blob in Redis is 50KB you're paying for
- Not handling cache misses gracefully — if Redis goes down, your app should fall back to DB, not crash
- Storing sessions in Redis without replication — single-node Redis failure logs out every user simultaneously
Always test your fallback path: redis-cli DEBUG SLEEP 30 simulates Redis being slow/unresponsive.
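The "handle cache misses gracefully" point can be sketched as a wrapper that degrades to a direct DB read whenever Redis errors out. The function and loader names here are illustrative, and the broad `except Exception` keeps the sketch dependency-free; in production you'd catch `redis.exceptions.RedisError` specifically and set `socket_timeout` on the client so a hung Redis fails fast.

```python
import json

def get_with_fallback(r, key, db_loader, ttl=3600):
    """Cache-aside read that survives a Redis outage (illustrative sketch)."""
    try:
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)
    except Exception:
        # Redis down or timing out: serve straight from the DB
        return db_loader(key)
    value = db_loader(key)
    try:
        r.setex(key, ttl, json.dumps(value))  # caching is best-effort
    except Exception:
        pass
    return value
```

The key property: a dead cache costs you latency, never availability.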
Redis done right is the difference between a $200/mo RDS bill and a $2,000/mo one. Pick the pattern that matches your access pattern, set your TTLs deliberately, and you'll cut infrastructure costs while improving p99 latency at the same time.