System Design

The Art of Not Doing Work Twice


Bahgat Bahgat Ahmed
· February 2026 · 20 min read

بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ

In the name of Allah, the Most Gracious, the Most Merciful

Your page loads in 3 seconds. Users complain. Your PM asks "can't we just make it faster?"

You add Redis. One line of code. Page loads in 200ms. You're a hero.

Then the bug reports start:

  • "I updated my profile photo but still see the old one"
  • "I cancelled my order but it still shows as active"
  • And the worst one: "I'm seeing someone else's dashboard"

You've just discovered the two hardest problems in computer science: cache invalidation and naming things. And you've hit the first one.

Quick Summary
  • Caching saves time — 1-5ms from cache vs 100-500ms from database
  • Cache invalidation is the hard part — deciding WHEN to update cached data
  • Cache keys matter — wrong keys = security breaches or stale data
  • Thundering herd can kill you — when cache expires and 1000 requests hit the DB at once

Want the full story? Keep reading.

This post is for you if:

  • You're adding caching to speed up a slow application
  • You're getting stale data issues after implementing caching
  • You want to understand caching patterns before you need them
  • You've been burned by database connection issues and want to reduce DB load

Why Cache? The Math

Before diving into the complexities, let's understand why caching is worth the trouble. The numbers are compelling:

The Caching Math: Why It Matters

Without Cache:

  • Database query: 100ms
  • Users/minute: 1,000
  • Queries/minute: 1,000
  • DB time/minute: 100 seconds

With Cache (5 min TTL):

  • Cache read: 1-5ms
  • Users/minute: 1,000
  • DB queries/minute: 0.2
  • DB time/minute: 0.02 seconds

With caching, 1000 requests share 1 database query. The database does 5000x less work.

The Math

If data can be stale for 5 minutes, you go from 1000 queries/minute to ~0.2 queries/minute (one query per 5 minutes). That's a 5000x reduction in database load.
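As a quick sanity check, the arithmetic fits in a few lines (the request rate, query time, and TTL are the assumed numbers from the table above):

```python
# Assumed rates: 1,000 requests/minute, 100ms per query, 5-minute TTL
# shared by all users.
requests_per_min = 1000
query_seconds = 0.100
ttl_seconds = 300

# With a shared cache, roughly one DB query per TTL window.
db_queries_per_min = 60 / ttl_seconds                     # ~0.2
reduction = requests_per_min / db_queries_per_min         # ~5000x

db_time_without = requests_per_min * query_seconds        # ~100 seconds/min
db_time_with = db_queries_per_min * query_seconds         # ~0.02 seconds/min
```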

New to this? What is caching, really?
The Library Help Desk

Imagine you work at a library help desk. People keep asking "Where are the Harry Potter books?" 50 times a day.

Without a sticky note: You walk to the back office to check the catalog every single time.

With a sticky note: You write down "Harry Potter - Aisle 7, Shelf 3" and answer instantly for the rest of the day.

That sticky note is your cache.

How It Actually Works

  1. Request comes in: "Get product #123"
  2. Check cache first (1-5ms)
  3. If found: return immediately. If not: query database, store in cache, then return.

Popular caching tools:

  • Redis (most popular)
  • Memcached (simple & fast)
  • CDN Cache (edge locations)

The Caching Layers

A typical web application has multiple caching layers. Each layer serves a different purpose and has different characteristics:

The Caching Layers Stack

A request passes through these layers in order, top to bottom:

  1. Browser Cache (~0ms): user-specific, controlled by Cache-Control headers
  2. CDN Cache (~10-50ms): geographic distribution, static content
  3. Application Cache (~1-5ms): Redis/Memcached, shared dynamic data (the focus of this post)
  4. Database Query Cache (~10-50ms): built into most databases, often forgotten
  5. Database (~100-500ms): source of truth, slowest layer

Each layer catches requests before they hit the next layer. Most requests should be served from the upper layers.

Cache Hit vs Cache Miss

Understanding the difference between a cache hit and miss is fundamental to caching:

Cache Hit vs Cache Miss Flow

Cache HIT (fast path), total time 1-5ms:

  1. Request arrives
  2. Check cache: FOUND!
  3. Return cached data

Cache MISS (slow path), total time 100-500ms:

  1. Request arrives
  2. Check cache: NOT FOUND
  3. Query database
  4. Store in cache for next time
  5. Return data

Every cache miss costs a database query. When cache misses spike, your DB takes the hit.

Part 2
The Hard Part: Cache Invalidation

Cache Invalidation Strategies

Data changes. The cached version becomes stale. How does the cache know? This is the core problem of caching.

There are three main strategies, each with tradeoffs:

Cache Invalidation Strategies

TTL-Based: set an expiration time

  • Example: cache.set(key, data, ttl=300)
  • Pro: simple to implement
  • Con: data stale until expiry
  • Use WHEN: data can be slightly stale (product catalog, blog posts)

Event-Based: invalidate on write

  • Example: on_update: cache.delete(key)
  • Pro: always fresh data
  • Con: complex with many write points
  • Use WHEN: freshness is critical (prices, inventory)

Write-Through: update both together

  • Example: write(db + cache)
  • Pro: consistent always
  • Con: slower writes
  • Use WHEN: read-heavy, write-light workloads
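The TTL and event-based strategies combine naturally: keep a TTL as a safety net and delete the key whenever the data changes. A minimal sketch, with dicts standing in for Redis and the database (all names illustrative):

```python
import time

_db = {"p1": 100}   # stand-in for the database
_cache = {}         # key -> (value, expires_at); stands in for Redis

def get_price(product_id, ttl=300):
    key = f"price:{product_id}"
    entry = _cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                         # fresh cached price
    price = _db[product_id]                     # cache miss: hit the DB
    _cache[key] = (price, time.time() + ttl)    # TTL as a safety net
    return price

def update_price(product_id, new_price):
    _db[product_id] = new_price                 # write the source of truth
    _cache.pop(f"price:{product_id}", None)     # event-based: drop stale entry now
```

After `update_price`, the very next `get_price` call re-reads the database, so there is no staleness window even though the TTL is five minutes.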
Quick Check

An e-commerce site caches product prices with a 5-minute TTL. A flash sale starts but customers still see old prices for up to 5 minutes. What's the fix?

  a. Reduce TTL to 30 seconds
  b. Add event-based invalidation when prices change
  c. Remove caching for prices entirely

Answer: (b). Event-based invalidation clears the cache immediately when prices change: delete the cached version on update, and the next request gets fresh data from the DB and re-caches it. Reducing the TTL still leaves a window of staleness, and removing caching entirely would hurt performance. The hybrid approach gives you both speed AND freshness.

The Invisible Key Mismatch

The most insidious cache bug isn't forgetting to invalidate — it's invalidating the wrong key.

War Story: The Invisible Key Mismatch

The setup: a chatbot caches prompt responses, and the cache key looks reasonable: chatbot:v2:prod:prompt:base

The caching code builds the key with language and user_exists included:

  chatbot:v2:prod-test:arabic:False:prompt:base

The invalidation code targets a key missing language and user_exists:

  chatbot:v2:prod-test:prompt:base

The result:

  WRITE:  chatbot:v2:prod-test:arabic:False:prompt:base
  DELETE: chatbot:v2:prod-test:prompt:base

The delete silently does nothing. Stale data serves forever. No error. No warning.

The Fix: Single Source of Truth for Cache Keys

Bad (keys scattered across files):

  service_a.py:  f"user:{id}:profile"
  service_b.py:  f"user:{id}:profile:v2"
  invalidate.py: f"user-{id}-profile"

Good (one key factory):

  keys.py:       CacheKeys.user_profile(id)
  service_a.py:  cache.set(CacheKeys...)
  invalidate.py: cache.delete(CacheKeys...)

Use WHEN: Multiple components read/write same cached data, cache keys have dynamic parts, team has more than one developer.

DON'T use when: Simple single-purpose cache, one file handles all caching logic.
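A key factory can be as small as a class of static methods. One possible sketch (the class and method names are illustrative, not from the war-story codebase):

```python
# keys.py: the single source of truth for cache key construction.
class CacheKeys:
    @staticmethod
    def user_profile(user_id):
        return f"user:{user_id}:profile"

    @staticmethod
    def chatbot_prompt(env, language, user_exists):
        # Every dynamic part of the key appears exactly once, here.
        return f"chatbot:v2:{env}:{language}:{user_exists}:prompt:base"
```

Because the caching code and the invalidation code call the same function, the WRITE and DELETE keys can no longer drift apart.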

Cache Stampede Prevention

When cache expires and many requests arrive at once, they ALL hit the database simultaneously. This is the thundering herd problem — and it can take down your system.

Cache Stampede: The Thundering Herd

  T = 0:    cache expires
  T = 0.1s: 1000 requests arrive. All check the cache. All miss. All query the database.
  T = 0.2s: database overwhelmed. 1000 identical queries. Connection pool exhausted. Timeouts begin.

All 1000 requests hit the database simultaneously.

Prevention Strategies

Stampede Prevention Strategies

  • Locking: the first request takes a lock and fetches from the DB. Others wait for the lock to release, then read the cached data. Result: 1 DB query instead of 1000.
  • Probabilistic early refresh: before actual expiry, randomly refresh in the background. By expiry time, fresh data is already cached. Result: the cache never actually expires.
  • Background refresh: a background job refreshes the cache before expiry, so users always hit warm cache (see async processing). Result: zero cache misses.

All three prevent the thundering herd. Locking is simplest. Background refresh is most robust for critical paths.
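Here is one way the locking strategy might look, with one lock per cache key and a double-check after acquiring it (the dict cache and helper names are illustrative):

```python
import threading
import time

_cache = {}                  # key -> (value, expires_at)
_locks = {}                  # key -> threading.Lock
_locks_guard = threading.Lock()

def _lock_for(key):
    # Lazily create one lock per cache key.
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())

def get_with_lock(key, load_from_db, ttl=300):
    entry = _cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                          # hit: no locking needed
    with _lock_for(key):                         # miss: serialize the refill
        entry = _cache.get(key)                  # re-check: another request
        if entry and entry[1] > time.time():     # may have refilled already
            return entry[0]
        value = load_from_db()                   # only one caller reaches the DB
        _cache[key] = (value, time.time() + ttl)
        return value
```

The re-check inside the lock is the crucial part: every waiting request finds the freshly cached value instead of repeating the query.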

Part 3
Real-World Patterns

Caching User-Specific Data Safely

The scariest caching bug: User A sees User B's data. This happens when cache keys don't properly isolate user data.

Cache Key Design: User Data Safety

DANGEROUS:

  cache_key = "user_dashboard"

Problem: all users share the same cache entry. User A sees User B's dashboard. This is a security breach, not just a bug.

SAFE:

  cache_key = f"user_dashboard:{user_id}"

Solution: include user_id in the key. Each user gets their own cache entry. User A only sees User A's data.
Security Rule

For user-specific data, the cache key MUST include a unique user identifier. This isn't optional. Missing this creates a data leak that exposes private information to other users.

Read-Through vs Write-Through

There are different architectural patterns for how caching integrates with your data flow:

Caching Architecture Patterns

  • Read-Through: the app asks the cache; on a miss, the cache itself fetches from the DB and returns the data. Use WHEN: read-heavy, can tolerate first-request latency.
  • Write-Through: the app writes to the cache, and the cache writes to the DB, keeping both in sync. Use WHEN: consistency critical, slower writes OK.
  • Write-Behind: the app writes to the cache and returns immediately; the DB write happens asynchronously later. Use WHEN: write-heavy, eventual consistency OK.

CDN & Edge Caching

For static content and public data, CDNs cache at the network edge — geographically close to users.

CDN: Caching at the Edge

Without CDN: a user in Tokyo hits a server in Virginia, ~200ms latency.
With CDN: the same user hits a CDN edge in Tokyo, ~20ms latency.

Good for CDN:

  • Static files (JS, CSS, images)
  • Public content (blog posts)
  • API responses that are the same for everyone

NOT for CDN:

  • User-specific data
  • Real-time data
  • Sensitive information
Deep Dive: HTTP Cache Headers

  • Cache-Control: public, max-age=31536000, immutable
    Cache for 1 year. Use for static assets with a hash in the filename.
  • Cache-Control: public, max-age=300, stale-while-revalidate=60
    Cache for 5 minutes, serve stale while fetching fresh. Good for API responses.
  • Cache-Control: private, no-store
    Never cache. Use for user-specific/sensitive data.
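One way to keep these three recipes consistent across an app is a tiny helper that maps a response category to its header value. A sketch (the category names are illustrative):

```python
def cache_control_for(resource_type):
    # Map a response category to the matching Cache-Control recipe.
    if resource_type == "hashed_static_asset":        # e.g. app.3f9a2c1.js
        return "public, max-age=31536000, immutable"
    if resource_type == "public_api_response":
        return "public, max-age=300, stale-while-revalidate=60"
    return "private, no-store"                        # user-specific or sensitive
```

Defaulting to `private, no-store` means a forgotten category fails safe: nothing sensitive gets cached by accident.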
Part 4
Reference

When NOT to Cache

Caching isn't always the answer. Sometimes it adds complexity without benefit — or creates more problems than it solves.

When Caching Hurts More Than It Helps

  • Rapidly changing data: stock prices, live scores, real-time locations. By the time you cache it, it's already stale.
  • Highly unique data: if every request needs different data, the cache hit rate is near 0%. You're just adding overhead.
  • Sensitive user data: medical records, financial data, personal messages. The risk of leaking to the wrong user outweighs the benefits.
  • Already fast operations: if your query is 5ms, adding a 1ms cache check doesn't help much. Optimize the query instead.

Decision Framework

Use this framework to decide if and how to cache something:

Should You Cache This?

  1. Is it read more than written?
     No → probably don't cache. Yes → continue.
  2. Can it be stale for a few seconds/minutes?
     No → you need event-based invalidation. Yes → TTL-based is fine.
  3. Is it user-specific?
     Yes → include user_id in the cache key! No → safe to share.
  4. Is the source query expensive (>50ms)?
     No → you may not need a cache. Yes → good caching candidate.
What To Do Monday

Adding caching for the first time?

Start with TTL-based caching on your slowest queries. 5-minute TTL covers most cases. Add event-based invalidation only when needed.

Start simple, add complexity when required

Seeing stale data bugs?

Check your cache invalidation. Are you deleting the EXACT key you're setting? Use a single key factory function.

Audit all cache.set() and cache.delete() calls

Worried about cache stampede?

Add locking around cache miss logic. Only one request should fetch from DB; others wait for the result.

Implement lock-based stampede prevention
Cheat Sheet: Caching

Common Problems

  • Cache invalidation is the hard part
  • Wrong cache key = security breach
  • Cache stampede can kill your DB
  • Stale data breaks user trust

Key Numbers

  • Cache read: 1-5ms
  • DB query: 100-500ms
  • CDN edge: 10-50ms
  • Browser cache: ~0ms

Solutions

  • TTL: Simple, tolerates staleness
  • Event-based: Fresh, more complex
  • Write-through: Always consistent
  • Locking: Prevents stampede

The Bottom Line

Caching is a tradeoff between speed and freshness. Start simple with TTL-based caching, add event-based invalidation where freshness matters, and always include user identifiers in cache keys for user-specific data. The two hardest problems in computer science remain: cache invalidation and naming things.

Caching + Async: Background Refresh

Want to keep caches warm without users ever hitting cold cache? Use async workers to pre-warm caches before TTL expires. Background refresh means users always hit warm cache while workers quietly regenerate expired data in the background.

وَاللَّهُ أَعْلَمُ

And Allah knows best

وَصَلَّى اللَّهُ وَسَلَّمَ وَبَارَكَ عَلَىٰ سَيِّدِنَا مُحَمَّدٍ وَعَلَىٰ آلِهِ

May Allah's peace and blessings be upon our master Muhammad and his family
