In the name of Allah, the Most Gracious, the Most Merciful
Your checkout flow takes 12 seconds. Users abandon carts. You profile the code and discover the actual payment processing takes 800 milliseconds.
So where do the other 11 seconds go?
- Sending confirmation email: 2 seconds
- Generating PDF receipt: 3 seconds
- Updating inventory in 3 systems: 2 seconds
- Notifying analytics: 1 second
- Sending webhook to partner: 2 seconds
- Logging to audit system: 1 second
You stare at the code. The user is waiting... for an email they'll read in 10 minutes. They're waiting for analytics they'll never see. They're waiting for a PDF they might never download.
Why?
- Don't make users wait for things they don't need — the biggest latency wins come from moving work to background
- Message queues are the bridge — they decouple "request" from "processing"
- Plan for failure — workers crash, messages get redelivered, idempotency is essential
Want the full story? Keep reading.
This post is for you if:
- Your API responses are slow because they do too much work inline
- Users are waiting for operations they don't care about
- You want to understand message queues without the enterprise jargon
- You're building for scale and need to decouple services
The Synchronous Trap
Most developers write code that does things sequentially because that's how we think. Step 1, then step 2, then step 3.
The user waits 11.8 seconds, but only cares about the 0.8s payment result. Everything else can happen after they see "Success!"
Imagine a restaurant where the waiter takes your order, walks to the kitchen, watches the chef cook, waits for the food, brings it to you, then takes the next order. Insane, right? But that's exactly how synchronous code works.
Humans process instructions sequentially. "Do A, then B, then C" maps directly to code.
Line 10 runs before line 11. Stack traces make sense. No race conditions.
For small scale, sync code works fine. The problems appear at scale.
But the user only cares about step 2 (payment) and the final "Success!" message. Everything else can happen after they see the success screen.
The question that changes everything: What MUST happen now vs what can happen later?
What Must Be Synchronous vs What Can Be Async?
This is the key insight. Not everything needs to block the user.
- Synchronous: the user needs the result to continue
- Asynchronous: the user doesn't need to wait for this
If the user doesn't need to see the result RIGHT NOW, it can be async.
The user needs to know the upload succeeded. But thumbnail generation, search indexing, and other processing can happen in the background. Show a placeholder until thumbnails are ready.
Thumbnail generation and search indexing don't block the user's next action. Only saving the original and confirming it worked needs to be synchronous. The rest can happen in background workers.
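The upload flow above can be sketched with Python's in-memory queue.Queue standing in for a real broker (SQS, RabbitMQ, etc.); `save_original`, `handle_upload`, and the task names are illustrative, not a real API:

```python
import queue

# In-memory stand-in for a real broker (SQS, RabbitMQ, Redis, ...).
jobs = queue.Queue()

def save_original(file_bytes: bytes) -> str:
    # Synchronous: the user must know the upload actually succeeded.
    return "upload-123"  # pretend storage key

def handle_upload(file_bytes: bytes) -> dict:
    upload_id = save_original(file_bytes)  # blocking, must succeed
    # Fire and forget: workers pick these up whenever they're ready.
    for task in ("generate_thumbnails", "index_for_search"):
        jobs.put({"task": task, "upload_id": upload_id})
    # Respond immediately; the UI shows a placeholder until thumbnails exist.
    return {"status": "ok", "upload_id": upload_id}
```

The only blocking call is the save; everything the user doesn't need right now becomes a queued job.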
Message Queues: The Bridge Between "Request" and "Processing"
A message queue is a buffer between the code that creates work and the code that does the work. Think of it as a to-do list that multiple workers can pull from.
The producer adds messages to the queue and immediately returns. Workers pull messages and process them whenever they're ready.
Imagine a busy restaurant. The waiter doesn't cook the food — they write the order on a ticket and clip it to the order wheel. The kitchen picks up tickets and cooks them. The waiter is free to take more orders immediately.
That ticket wheel? That's a message queue. It decouples "taking orders" from "making food."
Popular queue technologies include Amazon SQS, RabbitMQ, Redis (lists and streams), and Apache Kafka; the cheat sheet at the end covers how to pick between them.
The Producer-Consumer Pattern
The beauty of message queues is separation of concerns. Your web request code becomes simple:
def checkout(cart):
    charge_payment(cart)    # 800ms
    send_email(cart)        # 2000ms
    generate_pdf(cart)      # 3000ms
    update_inventory(cart)  # 2000ms
    notify_analytics(cart)  # 1000ms
    send_webhook(cart)      # 2000ms
    log_audit(cart)         # 1000ms
    return "Success"        # 11.8s later
def checkout(cart):
    charge_payment(cart)        # 800ms
    order = create_order(cart)  # 100ms
    # Fire and forget - user doesn't wait
    queue.send({
        "type": "order_completed",
        "order_id": order.id
    })
    return "Success"            # 0.9s

# Worker process (runs separately from web server)
def process_message(message):
    if message["type"] == "order_completed":
        send_email(message["order_id"])
        generate_pdf(message["order_id"])
        update_inventory(message["order_id"])
        # ... etc
User sees "Success!" in 0.9 seconds instead of 11.8 seconds. A 92% reduction in perceived latency. Same work gets done, but the user doesn't wait for it.
Async processing pairs beautifully with caching. You can use background workers to warm caches, regenerate expired data, and keep frequently-accessed content fresh — all without blocking user requests.
Your async workers still need database connections. With many workers processing in parallel, connection pools can exhaust quickly. Configure worker concurrency based on your connection limits, or you'll trade API timeouts for database connection errors.
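One simple guard is a bounded semaphore sized to the pool, sketched below (`DB_POOL_SIZE` and `process_with_db` are assumed names, not a specific library's API): workers wait for a free slot instead of opening a connection the pool can't give them.

```python
import threading

# Assumption: the database pool allows at most 10 open connections.
DB_POOL_SIZE = 10
db_slots = threading.BoundedSemaphore(DB_POOL_SIZE)

def process_with_db(message):
    # A worker must hold a slot before touching the database, so
    # concurrent DB work can never exceed the pool size; extra
    # workers simply block here instead of erroring out.
    with db_slots:
        # ... borrow a connection from the pool and do the real work
        return message["order_id"]
```

Most worker frameworks expose this as a concurrency setting; the point is that the number should be derived from your connection limit, not guessed.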
What If Workers Crash?
Moving work to background is great, but now we have new problems. What happens when things go wrong?
- Worker crashes mid-task: the email was half-sent, then the server died.
- Duplicate delivery: the user gets two confirmation emails.
- Queue overflow: messages pile up until memory is exhausted.
- Poison messages: bad data triggers an infinite retry loop.
A Dead Letter Queue is where messages go to die — but in a controlled way. Instead of retrying forever, failed messages are moved aside for human review.
This connects to failure handling patterns — retries with backoff, circuit breakers, and fallbacks all apply to async processing too.
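A minimal sketch of that retry-then-DLQ flow, with a plain list standing in for a real dead letter queue and illustrative names throughout:

```python
import time

MAX_ATTEMPTS = 3
dead_letter = []  # stand-in for a real dead letter queue

def process_with_retries(message, handler, base_delay=0.01):
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return handler(message)
        except Exception:
            if attempt == MAX_ATTEMPTS:
                # Give up: park the message for human review
                # instead of retrying forever.
                dead_letter.append(message)
                return None
            # Exponential backoff: wait 1x, 2x, 4x, ... the base delay.
            time.sleep(base_delay * 2 ** (attempt - 1))

def flaky_send(message):
    # Example handler that always fails, to exercise the DLQ path.
    raise RuntimeError("partner API returned 503")
```

Managed queues like SQS implement this for you (redrive policy plus a DLQ); the sketch just makes the mechanics visible.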
Idempotency: Safe to Process Twice
Network issues, crashes, and retries mean the same message might be processed multiple times. Your code must handle this gracefully.
The simplest approach: store processed message IDs and check before processing.
# Using Redis for idempotency tracking
import redis

r = redis.Redis()

def process_order_email(message):
    idempotency_key = f"email:{message['order_id']}"

    # Check if already processed
    if r.get(idempotency_key):
        print(f"Already sent email for {message['order_id']}")
        return  # Skip, acknowledge message

    # Actually send the email
    send_email(
        to=message['user_email'],
        subject="Order Confirmation",
        body=generate_email_body(message['order_id'])
    )

    # Mark as processed (with expiry)
    r.set(idempotency_key, "done", ex=86400)  # 24h
- Generate consistent idempotency keys from message data
- Check BEFORE doing the work, not after
- Set expiry to avoid storing keys forever
- Use atomic operations (Redis SET NX) to avoid race conditions
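The check-then-set above has a small race window: two workers can both pass the check before either marks the key. SET NX closes it by checking and marking in one atomic step. A tiny in-memory stand-in is used below so the example runs anywhere; redis-py's `Redis.set` accepts the same `nx=`/`ex=` keyword arguments with the same semantics:

```python
class FakeRedis:
    # In-memory stand-in for redis-py's Redis client (demo only).
    def __init__(self):
        self.store = {}

    def set(self, key, value, nx=False, ex=None):
        if nx and key in self.store:
            return None  # key already exists: another worker won the race
        self.store[key] = value
        return True

r = FakeRedis()

def claim_message(order_id: str) -> bool:
    # SET NX checks and marks in one atomic step, so two workers
    # can never both claim the same message.
    return r.set(f"email:{order_id}", "done", nx=True, ex=86400) is True
```

Only the worker whose `claim_message` returns True sends the email; everyone else skips and acknowledges.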
Backpressure: When Queues Overflow
What happens when producers create messages faster than consumers can process them? The queue grows. And grows. And eventually, something breaks.
- Add consumers: horizontal scaling. If one worker processes 100/sec, ten workers process 1,000/sec.
- Rate-limit producers: slow down input. Return 429 errors or queue locally until space opens.
- Use priority queues: process high-priority messages first. Analytics can wait; payment confirmations can't.
- Shed load: if acceptable, drop old analytics events. Not all messages are equal.
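Rate-limiting producers can be as simple as a bounded queue: accept until full, then tell callers to back off. A sketch with a deliberately tiny limit (the names and status codes are just for illustration):

```python
import queue

MAX_DEPTH = 3  # tiny for the demo; real limits are in the thousands
q = queue.Queue(maxsize=MAX_DEPTH)

def enqueue_or_reject(message) -> int:
    """Return an HTTP-style status: 202 accepted, 429 back off."""
    try:
        q.put_nowait(message)
        return 202
    except queue.Full:
        # Tell the producer to slow down instead of letting the
        # queue grow until something falls over.
        return 429
```

A 429 with a Retry-After hint pushes the backpressure all the way to the client, which is usually healthier than buffering without bound.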
Before scaling horizontally, understand WHY it's slow. Adding workers or raising limits treats the symptom, not the cause. Profile your worker code first: maybe there's a slow database query, an inefficient loop, or an external API bottleneck. You might find a simple fix that's cheaper than scaling.
Practice Mode: Test Your Understanding
Scenario: a user uploads a video. For each step below, decide which must be synchronous and which can run in background workers:
- Save the original file to storage
- Generate 4 different quality versions (480p, 720p, 1080p, 4K)
- Extract a thumbnail every 10 seconds
- Run content moderation AI
- Update the database with video metadata
A key like order_confirmation:{order_id} ensures that no matter how many times the message is retried, each order gets exactly one confirmation email. Keying on the email address alone would block ALL emails to that user; a random UUID defeats the purpose entirely.
Another scenario: a worker keeps hitting ExternalAPIUnavailable because the partner webhook endpoint returned 503, and the partner's API has been down for 2 hours. Retrying forever won't help here; this is exactly the case for a dead letter queue and an alert.
Cheat Sheet
When to Go Async
- Operation takes > 100ms
- User doesn't need result immediately
- Calling unreliable external services
- Can tolerate eventual consistency
- Notifications, reports, cleanup tasks
Reliability Checklist
- Acknowledgment: Only after success
- Idempotency: Safe to process twice
- Dead Letter Queue: After N failures
- Monitoring: Queue depth, lag, DLQ size
- Alerting: Growing queues = problem
Tool Selection
- SQS: Managed, infinite scale, simple
- RabbitMQ: Complex routing, exchanges
- Redis: Already using it? Good enough
- Kafka: Massive scale, event streaming
- Start simple (SQS/Redis), scale later
Decision Framework: Should This Be Async?
1. Does the user need the result to continue? Yes → keep synchronous. No → continue to step 2.
2. Does the operation take more than ~100ms or call an unreliable external service? No → probably fine synchronous. Yes → continue to step 3.
3. Can the work tolerate retries and eventual consistency? No → needs careful error handling (DLQ, alerts). Yes → great async candidate!
4. Going async? Add idempotency, set up monitoring, configure a DLQ.
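The steps above can be folded into one small helper (the function name and return strings are just for illustration):

```python
def should_be_async(user_needs_result_now: bool,
                    duration_ms: float,
                    tolerates_retries: bool) -> str:
    # Mirrors the decision steps, using the >100ms rule of thumb.
    if user_needs_result_now:
        return "keep synchronous"
    if duration_ms <= 100:
        return "probably fine synchronous"
    if not tolerates_retries:
        return "async, but needs careful error handling (DLQ, alerts)"
    return "great async candidate"
```

Run your slowest endpoints through it: anything that lands in the last bucket is a queue candidate for Monday.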
What To Do Monday
- If you're just starting: Use SQS or Redis. Don't start with Kafka unless you need millions of events.
- If you have slow endpoints: Profile them. Find the sync operations that don't need to block. Move them to a queue.
- If you're already using queues: Check your monitoring. Is queue depth stable? Are DLQ messages piling up? Are workers healthy?
Async systems fail differently than sync ones. Failure Handling covers the patterns you need: retries with exponential backoff, circuit breakers to prevent cascade failures, and graceful degradation. These apply directly to your async workers and queues.