System Design

The Art of Not Doing Work Twice


Bahgat Bahgat Ahmed
· February 2026 · 20 min read

بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ

In the name of Allah, the Most Gracious, the Most Merciful

Your page loads in 3 seconds. Users complain. Your PM asks "can't we just make it faster?"

You add Redis. One line of code. Page loads in 200ms. You're a hero.

Then the bug reports start:

  • "I updated my profile photo but still see the old one"
  • "I cancelled my order but it still shows as active"
  • And the worst one: "I'm seeing someone else's dashboard"

You've just discovered the two hardest problems in computer science: cache invalidation and naming things. And you've hit the first one.

Quick Summary
  • Caching saves time — 1-5ms from cache vs 100-500ms from database
  • Cache invalidation is the hard part — deciding WHEN to update cached data
  • Cache keys matter — wrong keys = security breaches or stale data
  • Thundering herd can kill you — when cache expires and 1000 requests hit the DB at once

Want the full story? Keep reading.

This post is for you if:

  • You're adding caching to speed up a slow application
  • You're getting stale data issues after implementing caching
  • You want to understand caching patterns before you need them
  • You've been burned by database connection issues and want to reduce DB load

Why Cache? The Math

Before diving into the complexities, let's understand why caching is worth the trouble. The numbers are compelling:

The Caching Math: Why It Matters

Without Cache:

  • Database query: 100ms
  • Users/minute: 1,000
  • Queries/minute: 1,000
  • DB time/minute: 100 seconds

With Cache (5 min TTL):

  • Cache read: 1-5ms
  • Users/minute: 1,000
  • DB queries/minute: 0.2
  • DB time/minute: 0.02 seconds

With caching, 1000 requests share 1 database query. The database does 5000x less work.

The Math

If data can be stale for 5 minutes, you go from 1000 queries/minute to ~0.2 queries/minute (one query per 5 minutes). That's a 5000x reduction in database load.
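As a quick sanity check, the arithmetic fits in a few lines (the request rate, query time, and TTL are the assumed numbers from the table above):

```python
# Assumed rates: 1,000 requests/minute, 100ms per query, 5-minute TTL
# shared by all users.
requests_per_min = 1000
query_seconds = 0.100
ttl_seconds = 300

# With a shared cache, roughly one DB query per TTL window.
db_queries_per_min = 60 / ttl_seconds                     # ~0.2
reduction = requests_per_min / db_queries_per_min         # ~5000x

db_time_without = requests_per_min * query_seconds        # ~100 seconds/min
db_time_with = db_queries_per_min * query_seconds         # ~0.02 seconds/min
```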

New to this? What is caching, really?
The Library Help Desk

Imagine you work at a library help desk. People keep asking "Where are the Harry Potter books?" 50 times a day.

Without a sticky note: You walk to the back office to check the catalog every single time.

With a sticky note: You write down "Harry Potter - Aisle 7, Shelf 3" and answer instantly for the rest of the day.

That sticky note is your cache.

How It Actually Works

  1. Request comes in: "Get product #123"
  2. Check cache first (1-5ms)
  3. If found: return immediately. If not: query database, store in cache, then return.

Popular caching tools:

  • Redis (most popular)
  • Memcached (simple & fast)
  • CDN Cache (edge locations)

The Caching Layers

A typical web application has multiple caching layers. Each layer serves a different purpose and has different characteristics:

The Caching Layers Stack

A request passes through these layers in order, top to bottom:

  1. Browser Cache (~0ms): user-specific, controlled by Cache-Control headers
  2. CDN Cache (~10-50ms): geographic distribution, static content
  3. Application Cache (~1-5ms): Redis/Memcached, shared dynamic data (the focus of this post)
  4. Database Query Cache (~10-50ms): built into most databases, often forgotten
  5. Database (~100-500ms): source of truth, slowest layer

Each layer catches requests before they hit the next layer. Most requests should be served from the upper layers.

Cache Hit vs Cache Miss

Understanding the difference between a cache hit and miss is fundamental to caching:

Cache Hit vs Cache Miss Flow

Cache HIT (fast path), total time 1-5ms:

  1. Request arrives
  2. Check cache: FOUND!
  3. Return cached data

Cache MISS (slow path), total time 100-500ms:

  1. Request arrives
  2. Check cache: NOT FOUND
  3. Query database
  4. Store in cache for next time
  5. Return data

Every cache miss costs a database query. When cache misses spike, your DB takes the hit.

Part 2
The Hard Part: Cache Invalidation

Cache Invalidation Strategies

Data changes. The cached version becomes stale. How does the cache know? This is the core problem of caching.

There are three main strategies, each with tradeoffs:

Cache Invalidation Strategies

TTL-Based: set an expiration time

  • Example: cache.set(key, data, ttl=300)
  • Pro: simple to implement
  • Con: data stale until expiry
  • Use WHEN: data can be slightly stale (product catalog, blog posts)

Event-Based: invalidate on write

  • Example: on_update: cache.delete(key)
  • Pro: always fresh data
  • Con: complex with many write points
  • Use WHEN: freshness is critical (prices, inventory)

Write-Through: update both together

  • Example: write(db + cache)
  • Pro: consistent always
  • Con: slower writes
  • Use WHEN: read-heavy, write-light workloads
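The TTL and event-based strategies combine naturally: keep a TTL as a safety net and delete the key whenever the data changes. A minimal sketch, with dicts standing in for Redis and the database (all names illustrative):

```python
import time

_db = {"p1": 100}   # stand-in for the database
_cache = {}         # key -> (value, expires_at); stands in for Redis

def get_price(product_id, ttl=300):
    key = f"price:{product_id}"
    entry = _cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                         # fresh cached price
    price = _db[product_id]                     # cache miss: hit the DB
    _cache[key] = (price, time.time() + ttl)    # TTL as a safety net
    return price

def update_price(product_id, new_price):
    _db[product_id] = new_price                 # write the source of truth
    _cache.pop(f"price:{product_id}", None)     # event-based: drop stale entry now
```

After `update_price`, the very next `get_price` call re-reads the database, so there is no staleness window even though the TTL is five minutes.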
Quick Check

An e-commerce site caches product prices with a 5-minute TTL. A flash sale starts but customers still see old prices for up to 5 minutes. What's the fix?

  a. Reduce TTL to 30 seconds
  b. Add event-based invalidation when prices change
  c. Remove caching for prices entirely

Answer: (b). Event-based invalidation clears the cache immediately when prices change: delete the cached version on update, and the next request gets fresh data from the DB and re-caches it. Reducing the TTL still leaves a window of staleness, and removing caching entirely would hurt performance. The hybrid approach gives you both speed AND freshness.

The Invisible Key Mismatch

The most insidious cache bug isn't forgetting to invalidate — it's invalidating the wrong key.

War Story: The Invisible Key Mismatch

The setup: a chatbot caches prompt responses, and the cache key looks reasonable: chatbot:v2:prod:prompt:base

The caching code builds the key with language and user_exists included:

  chatbot:v2:prod-test:arabic:False:prompt:base

The invalidation code targets a key missing language and user_exists:

  chatbot:v2:prod-test:prompt:base

The result:

  WRITE:  chatbot:v2:prod-test:arabic:False:prompt:base
  DELETE: chatbot:v2:prod-test:prompt:base

The delete silently does nothing. Stale data serves forever. No error. No warning.

The Fix: Single Source of Truth for Cache Keys

Bad (keys scattered across files):

  service_a.py:  f"user:{id}:profile"
  service_b.py:  f"user:{id}:profile:v2"
  invalidate.py: f"user-{id}-profile"

Good (one key factory):

  keys.py:       CacheKeys.user_profile(id)
  service_a.py:  cache.set(CacheKeys...)
  invalidate.py: cache.delete(CacheKeys...)

Use WHEN: Multiple components read/write same cached data, cache keys have dynamic parts, team has more than one developer.

DON'T use when: Simple single-purpose cache, one file handles all caching logic.
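A key factory can be as small as a class of static methods. One possible sketch (the class and method names are illustrative, not from the war-story codebase):

```python
# keys.py: the single source of truth for cache key construction.
class CacheKeys:
    @staticmethod
    def user_profile(user_id):
        return f"user:{user_id}:profile"

    @staticmethod
    def chatbot_prompt(env, language, user_exists):
        # Every dynamic part of the key appears exactly once, here.
        return f"chatbot:v2:{env}:{language}:{user_exists}:prompt:base"
```

Because the caching code and the invalidation code call the same function, the WRITE and DELETE keys can no longer drift apart.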

Cache Stampede Prevention

When cache expires and many requests arrive at once, they ALL hit the database simultaneously. This is the thundering herd problem — and it can take down your system.

Cache Stampede: The Thundering Herd

  T = 0:    cache expires
  T = 0.1s: 1000 requests arrive. All check the cache. All miss. All query the database.
  T = 0.2s: database overwhelmed. 1000 identical queries. Connection pool exhausted. Timeouts begin.

All 1000 requests hit the database simultaneously.

Prevention Strategies

Stampede Prevention Strategies

  • Locking: the first request takes a lock and fetches from the DB. Others wait for the lock to release, then read the cached data. Result: 1 DB query instead of 1000.
  • Probabilistic early refresh: before actual expiry, randomly refresh in the background. By expiry time, fresh data is already cached. Result: the cache never actually expires.
  • Background refresh: a background job refreshes the cache before expiry, so users always hit warm cache (see async processing). Result: zero cache misses.

All three prevent the thundering herd. Locking is simplest. Background refresh is most robust for critical paths.
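Here is one way the locking strategy might look, with one lock per cache key and a double-check after acquiring it (the dict cache and helper names are illustrative):

```python
import threading
import time

_cache = {}                  # key -> (value, expires_at)
_locks = {}                  # key -> threading.Lock
_locks_guard = threading.Lock()

def _lock_for(key):
    # Lazily create one lock per cache key.
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())

def get_with_lock(key, load_from_db, ttl=300):
    entry = _cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                          # hit: no locking needed
    with _lock_for(key):                         # miss: serialize the refill
        entry = _cache.get(key)                  # re-check: another request
        if entry and entry[1] > time.time():     # may have refilled already
            return entry[0]
        value = load_from_db()                   # only one caller reaches the DB
        _cache[key] = (value, time.time() + ttl)
        return value
```

The re-check inside the lock is the crucial part: every waiting request finds the freshly cached value instead of repeating the query.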

Part 3
Real-World Patterns

Caching User-Specific Data Safely

The scariest caching bug: User A sees User B's data. This happens when cache keys don't properly isolate user data.

Cache Key Design: User Data Safety

DANGEROUS:

  cache_key = "user_dashboard"

Problem: all users share the same cache entry. User A sees User B's dashboard. This is a security breach, not just a bug.

SAFE:

  cache_key = f"user_dashboard:{user_id}"

Solution: include user_id in the key. Each user gets their own cache entry. User A only sees User A's data.
Security Rule

For user-specific data, the cache key MUST include a unique user identifier. This isn't optional. Missing this creates a data leak that exposes private information to other users.

Read-Through vs Write-Through

There are different architectural patterns for how caching integrates with your data flow:

Caching Architecture Patterns

  • Read-Through: the app asks the cache; on a miss, the cache itself fetches from the DB and returns the data. Use WHEN: read-heavy, can tolerate first-request latency.
  • Write-Through: the app writes to the cache, and the cache writes to the DB, keeping both in sync. Use WHEN: consistency critical, slower writes OK.
  • Write-Behind: the app writes to the cache and returns immediately; the DB write happens asynchronously later. Use WHEN: write-heavy, eventual consistency OK.

CDN & Edge Caching

For static content and public data, CDNs cache at the network edge — geographically close to users.

CDN: Caching at the Edge

Without CDN: a user in Tokyo hits a server in Virginia, ~200ms latency.
With CDN: the same user hits a CDN edge in Tokyo, ~20ms latency.

Good for CDN:

  • Static files (JS, CSS, images)
  • Public content (blog posts)
  • API responses that are the same for everyone

NOT for CDN:

  • User-specific data
  • Real-time data
  • Sensitive information
Deep Dive: HTTP Cache Headers

  • Cache-Control: public, max-age=31536000, immutable
    Cache for 1 year. Use for static assets with a hash in the filename.
  • Cache-Control: public, max-age=300, stale-while-revalidate=60
    Cache for 5 minutes, serve stale while fetching fresh. Good for API responses.
  • Cache-Control: private, no-store
    Never cache. Use for user-specific/sensitive data.
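One way to keep these three recipes consistent across an app is a tiny helper that maps a response category to its header value. A sketch (the category names are illustrative):

```python
def cache_control_for(resource_type):
    # Map a response category to the matching Cache-Control recipe.
    if resource_type == "hashed_static_asset":        # e.g. app.3f9a2c1.js
        return "public, max-age=31536000, immutable"
    if resource_type == "public_api_response":
        return "public, max-age=300, stale-while-revalidate=60"
    return "private, no-store"                        # user-specific or sensitive
```

Defaulting to `private, no-store` means a forgotten category fails safe: nothing sensitive gets cached by accident.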
Part 4
Reference

When NOT to Cache

Caching isn't always the answer. Sometimes it adds complexity without benefit — or creates more problems than it solves.

When Caching Hurts More Than It Helps

  • Rapidly changing data: stock prices, live scores, real-time locations. By the time you cache it, it's already stale.
  • Highly unique data: if every request needs different data, the cache hit rate is near 0%. You're just adding overhead.
  • Sensitive user data: medical records, financial data, personal messages. The risk of leaking to the wrong user outweighs the benefits.
  • Already fast operations: if your query is 5ms, adding a 1ms cache check doesn't help much. Optimize the query instead.

Decision Framework

Use this framework to decide if and how to cache something:

Should You Cache This?

  1. Is it read more than written?
     No → probably don't cache. Yes → continue.
  2. Can it be stale for a few seconds/minutes?
     No → you need event-based invalidation. Yes → TTL-based is fine.
  3. Is it user-specific?
     Yes → include user_id in the cache key! No → safe to share.
  4. Is the source query expensive (>50ms)?
     No → you may not need a cache. Yes → good caching candidate.
What To Do Monday

Adding caching for the first time?

Start with TTL-based caching on your slowest queries. 5-minute TTL covers most cases. Add event-based invalidation only when needed.

Start simple, add complexity when required

Seeing stale data bugs?

Check your cache invalidation. Are you deleting the EXACT key you're setting? Use a single key factory function.

Audit all cache.set() and cache.delete() calls

Worried about cache stampede?

Add locking around cache miss logic. Only one request should fetch from DB; others wait for the result.

Implement lock-based stampede prevention
Cheat Sheet: Caching

Common Problems

  • Cache invalidation is the hard part
  • Wrong cache key = security breach
  • Cache stampede can kill your DB
  • Stale data breaks user trust

Key Numbers

  • Cache read: 1-5ms
  • DB query: 100-500ms
  • CDN edge: 10-50ms
  • Browser cache: ~0ms

Solutions

  • TTL: Simple, tolerates staleness
  • Event-based: Fresh, more complex
  • Write-through: Always consistent
  • Locking: Prevents stampede

The Bottom Line

Caching is a tradeoff between speed and freshness. Start simple with TTL-based caching, add event-based invalidation where freshness matters, and always include user identifiers in cache keys for user-specific data. The two hardest problems in computer science remain: cache invalidation and naming things.

Caching + Async: Background Refresh

Want to keep caches warm without users ever hitting cold cache? Use async workers to pre-warm caches before TTL expires. Background refresh means users always hit warm cache while workers quietly regenerate expired data in the background.

وَاللَّهُ أَعْلَمُ

And Allah knows best

وَصَلَّى اللَّهُ وَسَلَّمَ وَبَارَكَ عَلَىٰ سَيِّدِنَا مُحَمَّدٍ وَعَلَىٰ آلِهِ

May Allah's peace and blessings be upon our master Muhammad and his family
