
Asynchronous API Pattern Simulator

Explore how synchronous request handling can trigger cascading failures and how an asynchronous request-reply pattern with queuing, TTL, and backpressure keeps services responsive. Visualize load distribution, queue states, and downstream worker activity in real time using generic gateway and backend services.

Scenario Summary

A sudden burst of 514 simultaneous requests hits a public BFF (Backend-for-Frontend) that proxies to a slow downstream backend service. The synchronous BFF ties up its worker pool waiting on the backend and begins failing health checks. By contrast, an asynchronous BFF queues the work, applies a queue TTL, and keeps responding to clients instantly.

Sample A: Thread-per-Request API Gateway (12s downstream latency)

The synchronous API gateway keeps a worker from its thread pool allocated for the entire HTTP request/response cycle. The event-driven gateway acknowledges immediately and defers work to background processors.

Synchronous BFF (Thread-per-Request)

[Interactive panel: Client, BFF, and BE Service status indicators, all starting in the Idle state. Press "Run Sample" to observe HTTP 200 vs 202 behaviour. After the run the panel reports Call Start, Gateway Hold Time, and UI Blocked?.]

Architect's Note

Because the backend responds within the 30-second upstream timeout (12 s here), the synchronous gateway returns HTTP 200. The trade-off is that the thread-pool worker remains blocked for the entire call, preventing it from serving other clients.
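As a rough illustration of why the worker stays pinned, here is a minimal blocking proxy route in TypeScript/Express. The route path, BACKEND_URL, and the framework choice are assumptions for illustration only; in a true thread-per-request stack the same shape holds an OS thread rather than just a connection.

```typescript
import express from "express";

const app = express();

// Hypothetical slow downstream dependency (~12 s in Sample A).
const BACKEND_URL = "http://backend.internal/slow-operation";

// Blocking proxy: the handler (and, in a thread-per-request server, the worker
// thread behind it) is held for the full downstream round trip.
app.post("/api/eligibility", async (_req, res) => {
  const upstream = await fetch(BACKEND_URL, { method: "POST" });
  const body = await upstream.json();
  res.status(200).json(body); // HTTP 200 only arrives after the backend finishes
});

app.listen(3000);
```

The handler cannot return until the downstream call resolves, so a 12 s backend holds the worker for 12 s per request.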

Asynchronous BFF (202 Accepted + Event Processor)

[Interactive panel: Client, BFF, and BE Service status indicators, all starting in the Idle state. Press "Run Sample" to see HTTP 202 Accepted followed by a final HTTP 200 when polling completes. After the run the panel reports Job Accepted, Client Waiting for Result, and UI Blocked?.]

Architect's Note

The ingress layer emits to a message broker, responds with HTTP 202, and a worker service completes the job asynchronously. No client threads are blocked.
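A minimal sketch of that ingress behaviour, assuming an Express-style BFF and a generic broker client (the queue object and topic name are stand-ins, not the simulator's actual code):

```typescript
import express from "express";
import { randomUUID } from "node:crypto";

// In-memory stand-in for a real message broker client (SQS, RabbitMQ, ...);
// the publish() shape here is an assumption for illustration only.
const queue = {
  async publish(topic: string, payload: unknown): Promise<void> {
    console.log(`enqueued on ${topic}`, payload);
  },
};

const app = express();
app.use(express.json());

// Ingress tier: validate, enqueue, acknowledge immediately with 202 Accepted.
app.post("/api/eligibility-jobs", async (req, res) => {
  const jobId = randomUUID();
  await queue.publish("eligibility-jobs", { jobId, payload: req.body });
  res
    .status(202)
    .location(`/api/eligibility-jobs/${jobId}`)
    .json({ jobId, status: "pending" });
});

app.listen(3000);
```

The response goes out as soon as the job is durably enqueued, so ingress latency is independent of backend latency.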

Sample B: Backend exceeds timeout budget

Here the synchronous API gateway hits its configured 30s upstream timeout and surfaces HTTP 504 Gateway Timeout, while the asynchronous workflow still delivers a result via polling.

Synchronous BFF (Thread-per-Request)

[Interactive panel: Client, BFF, and BE Service status indicators, all starting in the Idle state. Press "Run Sample" to see HTTP 504 Gateway Timeout once the 30s upstream timeout elapses. After the run the panel reports Call Start, Gateway Hold Time, and UI Blocked?.]

Architect's Note

Once the upstream timeout elapses, the gateway aborts the socket and returns HTTP 504. The backend may still complete, but the client connection has already failed.
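One way to express that timeout budget in code is sketched below, assuming a Node 17.3+ runtime where AbortSignal.timeout is available; the function name and URL are hypothetical.

```typescript
// Hypothetical upstream call with a 30 s budget.
async function proxyWithTimeout(url: string): Promise<{ status: number; body?: unknown }> {
  try {
    const upstream = await fetch(url, { signal: AbortSignal.timeout(30_000) });
    return { status: 200, body: await upstream.json() };
  } catch (err) {
    // The aborted fetch rejects with a TimeoutError; the gateway surfaces 504.
    if ((err as { name?: string })?.name === "TimeoutError") {
      return { status: 504 }; // backend may still finish, but this client call is already over
    }
    throw err;
  }
}
```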

Resilient Approach: Asynchronous Workflow

[Interactive panel: Client, BFF, and BE Service status indicators, all starting in the Idle state. Press "Run Sample" to watch HTTP 202 followed by a 200 poll result, even though processing exceeds 30 seconds. After the run the panel reports Job Accepted, Client Waiting for Result, and UI Blocked?.]

Architect's Note

Decoupling via a queue means the ingress tier is free after returning 202. Long-running processing happens inside the asynchronous worker pool and clients poll for status.
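For illustration, a stripped-down worker might look like the following sketch; the job-store shape and the simulated 45 s backend call are assumptions, not the simulator's implementation.

```typescript
// Worker-side sketch: consume a job, run the slow backend call, record the result.
interface Job {
  jobId: string;
  payload: unknown;
}

type JobRecord = { status: "pending" | "complete"; result?: unknown };

export const jobStore = new Map<string, JobRecord>(); // the poll endpoint reads from here

export async function handleJob(job: Job): Promise<void> {
  jobStore.set(job.jobId, { status: "pending" });
  const result = await longRunningBackendCall(job.payload); // may take well over 30 s
  jobStore.set(job.jobId, { status: "complete", result });  // subsequent polls return 200 + result
}

async function longRunningBackendCall(payload: unknown): Promise<unknown> {
  await new Promise((resolve) => setTimeout(resolve, 45_000)); // stand-in for the slow BE service
  return { processed: payload };
}
```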

Load Simulation: Burst of 150 Calls

Watch how synchronous and asynchronous gateway patterns behave when 150 requests arrive over a few seconds. Adjust the queue TTL to model when outdated work should be discarded.

Synchronous BFF Under Load

Thread Pool Threads (Capacity: 40)

Backend Worker Status (15 Workers)

Legend:
  • Thread Busy (blocking upstream HTTP 200)
  • Thread Timed Out (HTTP 504)
  • Request Rejected (HTTP 503, thread pool exhausted)
  • Thread Released (resource reclaimed)

Each square represents a thread inside the gateway's worker pool. When all squares are amber, the thread pool is saturated and new requests are refused.

[Live counters: Status (Idle), Total Time (0s), Completed (HTTP 200), Timeouts (HTTP 504), Rejected (HTTP 503).]

Simulation Analysis

Gateway thread pool capacity saturates quickly. Once all 40 threads are blocked waiting on the backend, any new call is rejected immediately with 503 Service Unavailable. Threads that remain blocked for 30 seconds flip to 504 Gateway Timeout, mirroring the incident's production behaviour.
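That fail-fast behaviour can be sketched as a simple admission gate in front of the blocking call; the capacity of 40 mirrors the simulation, while the handler itself is illustrative.

```typescript
// Admission control modelled on a fixed-size thread pool.
class AdmissionGate {
  private inFlight = 0;
  constructor(private readonly capacity: number) {}

  tryAcquire(): boolean {
    if (this.inFlight >= this.capacity) return false; // pool saturated -> fail fast
    this.inFlight++;
    return true;
  }

  release(): void {
    this.inFlight--;
  }
}

const gate = new AdmissionGate(40);

// Hypothetical request handler wrapping the blocking backend call.
export async function handleRequest(callBackend: () => Promise<unknown>): Promise<{ status: number }> {
  if (!gate.tryAcquire()) return { status: 503 }; // rejected immediately: 503 Service Unavailable
  try {
    await callBackend();                          // holds a "thread" up to the 30 s timeout
    return { status: 200 };
  } catch {
    return { status: 504 };                       // upstream timeout surfaced to the client
  } finally {
    gate.release();
  }
}
```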

Asynchronous BFF Under Load

Gateway Request Status

Backend Worker Status (15 Workers)

Queue Depth: 0 requests waiting

Legend:
  • Queued / Processing (202 in flight)
  • Completed (polled 200)
  • Expired (TTL → HTTP 410 Gone)
  • Rejected (Queue Full → HTTP 429/503)

[Live counters: Status (Idle), Total Time (0s), Completed (poll 200), Expired (HTTP 410), Rejected (HTTP 429/503).]

Simulation Analysis

The asynchronous gateway queues surplus work and keeps responding instantly. If the queue limit or TTL is reached, requests are declined or expired immediately, providing backpressure without overwhelming workers.
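A minimal sketch of that backpressure logic, assuming an in-memory queue; the depth limit and TTL values below are illustrative, not the simulator's defaults.

```typescript
// Queue-side backpressure: a bounded queue plus a TTL check at dequeue time.
interface QueuedJob {
  jobId: string;
  enqueuedAt: number;
}

const MAX_DEPTH = 200;   // illustrative queue limit
const TTL_MS = 60_000;   // illustrative TTL
const queue: QueuedJob[] = [];

export function enqueue(jobId: string): number {
  if (queue.length >= MAX_DEPTH) return 429; // queue full -> decline with a backpressure signal
  queue.push({ jobId, enqueuedAt: Date.now() });
  return 202;                                // accepted for asynchronous processing
}

export function dequeueFresh(): QueuedJob | undefined {
  while (queue.length > 0) {
    const job = queue.shift()!;
    if (Date.now() - job.enqueuedAt <= TTL_MS) return job; // still within TTL
    markExpired(job.jobId);                                // later polls answer 410 Gone
  }
  return undefined;
}

function markExpired(jobId: string): void {
  console.log(`job ${jobId} expired before a worker picked it up`);
}
```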

Architecture & Incident Glossary

  • Thread Pool Exhaustion: When all worker threads in the synchronous gateway are blocked on upstream calls, forcing new requests to fail fast with 503 Service Unavailable.
  • Ingress / API Gateway: The edge tier handling client HTTP traffic. In the incident this was the service whose thread pool saturated.
  • Downstream Microservice: The synchronous dependency with elevated latency that held gateway threads for 30+ seconds.
  • Asynchronous Gateway Pattern: Converts client calls into background jobs (often via queue or event bus) and returns 202 Accepted to avoid blocking threads.
  • TTL (Time-to-Live): Maximum age for queued work. Expired jobs surface as 410 Gone responses in the async simulation.
  • Backpressure: Mechanisms (queue limits, fast failures) preventing overload from propagating downstream.
  • Observability Signals: 5xx error spikes (503/504), thread utilization metrics, and queue depth are key indicators highlighted in this simulator.

Asynchronous Request-Reply Pattern

Many digital products have journeys that depend on heavy downstream workflows—credit adjudication, large document generation, analytics or batch lookups—that routinely exceed 10–15 seconds. Keeping those calls synchronous between the client and BFF pins threads, amplifies load on shared infrastructure, and pushes users toward "spinner fatigue" followed by 504 Gateway Timeout errors when latency spikes. The remedy is to lean on the Asynchronous Request-Reply Pattern whenever you know a backend step is long-running, bursty, or expensive.

How It Works

  1. Initiate Job: The client UI issues a non-blocking call to the BFF (for example, POST /api/eligibility-jobs) to enqueue the heavy work.
  2. Poll for Result: The BFF responds immediately with a Job ID. The client polls GET /api/eligibility-jobs/{id} until the BE service marks the job complete (a minimal polling sketch follows below).
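A hypothetical client-side polling loop for step 2 might look like this; the interval, attempt cap, and response field names are assumptions for illustration, and the relative URL assumes the code runs in the browser UI.

```typescript
type JobStatus = { status: "pending" | "complete" | "failed"; result?: unknown };

export async function pollForResult(
  jobId: string,
  intervalMs = 2_000,
  maxAttempts = 60,
): Promise<unknown> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(`/api/eligibility-jobs/${jobId}`);
    if (res.status === 410) throw new Error("Job expired (queue TTL elapsed)");
    const job = (await res.json()) as JobStatus;
    if (job.status === "complete") return job.result;
    if (job.status === "failed") throw new Error("Job failed");
    await new Promise((resolve) => setTimeout(resolve, intervalMs)); // wait before the next poll
  }
  throw new Error("Gave up polling");
}
```

The polling interval is a deliberate tuning knob: polling too aggressively recreates the very load the pattern is meant to shed.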

Why We Recommend This Pattern

  • Guarantees UI responsiveness: The initial acknowledgement returns instantly. The UI can render local progress indicators while polling happens in the background.
  • Protects BFF resources: Connections are released immediately, preventing thread pool exhaustion and maintaining platform stability even when downstream services degrade.

Sequence Diagram

sequenceDiagram
    autonumber
    participant Client as Client UI
    participant BFF as BFF
    participant Queue as Message Broker / Queue
    participant Worker as BE Service Worker
    Client->>BFF: POST /api/eligibility-jobs
    BFF->>Queue: Enqueue Job Payload
    Queue-->>Worker: Deliver Job
    BFF-->>Client: 202 Accepted (jobId)
    loop Poll every N seconds
        Client->>BFF: GET /api/eligibility-jobs/{jobId}
        alt Job still processing
            BFF-->>Client: 200 OK (status=pending)
        else Job complete
            Worker-->>Queue: Mark Completed
            BFF-->>Client: 200 OK (result payload)
            Note over Client,Worker: UI updates with final result
        end
    end