API INTEGRATION PATTERNS
Integrating with external APIs is one of the most common tasks in modern web development. It's also one of the easiest places to write code that works fine in development but falls apart in production. Network calls fail. Rate limits get hit. Responses change shape unexpectedly. The patterns in this article will help you write API integrations that are resilient, observable, and easy to maintain.
BUILDING A TYPED API CLIENT
Before reaching for patterns like circuit breakers, get the foundation right: a centralized API client. Scattering fetch calls throughout your codebase makes it nearly impossible to add cross-cutting behavior like auth, logging, or retries later.
The idea is a class called ApiClient that takes a base URL and an API key in its constructor. It stores default headers including Content-Type and an Authorization Bearer token, then exposes a single generic request method. That method accepts a path and optional fetch options, sets up an AbortController tied to a configurable timeout (defaulting to 10 seconds), and fires the request. If the response comes back with a non-OK status it throws a custom ApiError. Otherwise it parses the JSON, wraps the result in an ApiResponse object containing the data, status code, and headers, and returns it.
This gives you one place to attach auth headers, enforce timeouts, and later bolt on all the more sophisticated patterns described below.
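The client described above might look roughly like the following. This is a minimal sketch, not a definitive implementation: the `ApiClientOptions` object with its `timeoutMs` and `fetchImpl` fields is an assumption added here so the timeout is configurable and the network call can be stubbed in tests, and the exact shape of `ApiError` is illustrative.

```typescript
// Thrown for any non-OK response; carries the status code and raw body.
class ApiError extends Error {
  constructor(public readonly status: number, public readonly body: string) {
    super(`API request failed with status ${status}`);
    this.name = "ApiError";
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on older compile targets
  }
}

interface ApiResponse<T> {
  data: T;
  status: number;
  headers: Headers;
}

// Assumption: this options object is illustrative and not part of the article's description.
interface ApiClientOptions {
  timeoutMs?: number; // defaults to 10 seconds
  fetchImpl?: typeof fetch; // lets tests substitute a stub for the global fetch
}

class ApiClient {
  private readonly defaultHeaders: Record<string, string>;
  private readonly timeoutMs: number;
  private readonly fetchImpl: typeof fetch;

  constructor(private readonly baseUrl: string, apiKey: string, options: ApiClientOptions = {}) {
    this.defaultHeaders = {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    };
    this.timeoutMs = options.timeoutMs ?? 10_000;
    this.fetchImpl = options.fetchImpl ?? ((input, init) => fetch(input, init));
  }

  async request<T>(path: string, init: RequestInit = {}): Promise<ApiResponse<T>> {
    // Abort the request if it outlives the timeout; fetch then rejects with an AbortError.
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), this.timeoutMs);

    // Per-call headers override the defaults.
    const headers = new Headers(this.defaultHeaders);
    new Headers(init.headers).forEach((value, key) => headers.set(key, value));

    try {
      const response = await this.fetchImpl(`${this.baseUrl}${path}`, {
        ...init,
        headers,
        signal: controller.signal,
      });
      if (!response.ok) {
        throw new ApiError(response.status, await response.text());
      }
      const data = (await response.json()) as T;
      return { data, status: response.status, headers: response.headers };
    } finally {
      clearTimeout(timer);
    }
  }
}
```

A call site would then read something like `const { data } = await client.request<User>("/users/123")`, where `User` and the path are placeholders for your own types and endpoints.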
ERROR HANDLING
The first instinct is to wrap everything in a try/catch. That is necessary but not sufficient. You need to distinguish between different kinds of failures because they require different responses.
A custom ApiError class extends the built-in Error and stores the HTTP status code alongside the raw response body. It exposes two computed properties: isRetryable, which returns true for 429 rate-limit responses and any 5xx server error, and isAuthError, which returns true for 401 and 403. With those helpers in place, a fetchUser function can make smart decisions at the catch site. An auth error triggers a token refresh; this mainly helps with 401, since a 403 usually means the credentials are valid but lack permission. A 404 returns null instead of throwing, because not finding a resource is a perfectly normal outcome that should not be modeled as an exception. A retryable error gets re-thrown so an upper retry layer can handle it.
The key principle is that, for a lookup, 404 is not an exception. It is a valid response that means the resource does not exist. Letting a 404 escape as a thrown error forces callers to use try/catch for ordinary control flow, which is an antipattern. The low-level client can still throw ApiError for every non-OK status; the translation happens once, in the resource-level function, which returns null or a typed "not found" value instead.
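The error class and the catch-site decisions described above could be sketched as follows. The `UserApi` interface, its `getUser` and `refreshToken` methods, and the `User` shape are hypothetical stand-ins so the example is self-contained; in a real codebase they would be backed by the central API client.

```typescript
class ApiError extends Error {
  constructor(public readonly status: number, public readonly body: string) {
    super(`API request failed with status ${status}`);
    this.name = "ApiError";
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on older compile targets
  }

  // Rate limits and server-side failures are worth retrying.
  get isRetryable(): boolean {
    return this.status === 429 || (this.status >= 500 && this.status < 600);
  }

  get isAuthError(): boolean {
    return this.status === 401 || this.status === 403;
  }
}

interface User {
  id: string;
  name: string;
}

// Hypothetical dependencies, injected so the sketch runs without a network.
interface UserApi {
  getUser(id: string): Promise<User>; // assumed to reject with ApiError on non-OK responses
  refreshToken(): Promise<void>;
}

async function fetchUser(api: UserApi, id: string): Promise<User | null> {
  try {
    return await api.getUser(id);
  } catch (error) {
    // Anything that is not an ApiError (network failure, abort, bug) propagates unchanged.
    if (!(error instanceof ApiError)) throw error;

    // "Not found" is an ordinary outcome, so it becomes a value rather than an exception.
    if (error.status === 404) return null;

    if (error.isAuthError) {
      // Refresh once and try again; any error from the second attempt propagates.
      await api.refreshToken();
      return api.getUser(id);
    }

    // Retryable (429, 5xx) and all remaining errors go up to the retry layer or the caller.
    throw error;
  }
}
```

Callers then branch on the return value (`if (user === null) { ... }`) and reserve try/catch for failures that really are exceptional.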
RETRY WITH EXPONENTIAL BACKOFF
Transient failures, such as a server hiccup, a brief network blip, or a momentary rate limit, are normal at scale. Instead of immediately surfacing these to users, retry with exponential backoff.
A withRetry wrapper accepts a function to call and a RetryOptions object with fields for maxAttempts, baseDelayMs, maxDelayMs, and an optional jitter flag. It loops up to maxAttempts times. On each failure it checks whether the error is retryable and whether attempts remain. If both are true it calculates a delay: the base delay multiplied by two raised to the zero-based attempt index, capped at maxDelayMs. When jitter is enabled (the default) the actual wait is a random value between 50 and 100 percent of that capped delay. With a base delay of 200 ms, for example, the first three retries wait at most 200, 400, and 800 ms.
The jitter is important. Without it, if your entire fleet of servers hits the same transient error simultaneously, they will all retry at the same intervals and create a thundering herd that makes the downstream service's situation worse. Adding randomness spreads the retry load across time and gives the service breathing room to recover.
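A minimal sketch of such a wrapper follows. Two details are assumptions made for illustration: the optional `isRetryable` hook on `RetryOptions`, which by default looks for an `isRetryable === true` property on the thrown error, and the separate `computeDelay` helper with an injectable random source, which exists only to make the delay arithmetic easy to inspect.

```typescript
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  jitter?: boolean; // defaults to true
  // Assumption: illustrative hook for deciding what counts as retryable.
  isRetryable?: (error: unknown) => boolean;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// By default, retry only errors that flag themselves as retryable.
// Plain network errors would need a custom isRetryable predicate.
function defaultIsRetryable(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { isRetryable?: unknown }).isRetryable === true
  );
}

// Delay before the retry that follows the given zero-based attempt.
function computeDelay(
  attempt: number,
  options: RetryOptions,
  random: () => number = Math.random,
): number {
  const capped = Math.min(options.baseDelayMs * 2 ** attempt, options.maxDelayMs);
  if (options.jitter === false) return capped;
  // Pick a value between 50% and 100% of the capped delay.
  return capped * (0.5 + random() * 0.5);
}

async function withRetry<T>(fn: () => Promise<T>, options: RetryOptions): Promise<T> {
  const isRetryable = options.isRetryable ?? defaultIsRetryable;
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const attemptsRemain = attempt + 1 < options.maxAttempts;
      // Give up on non-retryable errors or once the attempt budget is spent.
      if (!isRetryable(error) || !attemptsRemain) throw error;
      await sleep(computeDelay(attempt, options));
    }
  }
}
```

A typical call might be `withRetry(() => loadOrders(), { maxAttempts: 4, baseDelayMs: 200, maxDelayMs: 5000 })`, where `loadOrders` is a placeholder for any function returning a Promise. Only wrap operations that are safe to repeat.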
CIRCUIT BREAKER PATTERN
Retries solve transient errors. But what if an API is down for an extended period? Retrying repeatedly wastes resources and makes your users wait. The circuit breaker pattern stops calling a failing service and gives it time to recover.
A CircuitBreaker class tracks three states: closed for normal operation, open for fast-failing without hitting the network, and half-open for testing whether the service has recovered. It maintains a failure count and a timestamp of the last failure. Every call goes through an execute method. If the circuit is open, it checks whether enough time has passed since the last failure. If not, it throws immediately without touching the API. If the reset timeout has elapsed, it transitions to half-open and allows one request through. A successful request resets the failure count and closes the circuit. A failure increments the count, records the time, and opens the circuit if the failure threshold is crossed. Because the count is reset only on success, a failed probe in the half-open state is still over the threshold and reopens the circuit immediately.
The threshold and reset timeout are configurable. A reasonable starting point for most third-party integrations is five failures to open and thirty seconds to attempt recovery.
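The state machine described above can be sketched as below. The `CircuitOpenError` class, the `currentState` getter, and the injectable `now` clock are assumptions added for illustration and testability. The sketch also does not limit concurrency while half-open, so several simultaneous callers could all act as probes; a production breaker would likely guard against that.

```typescript
type CircuitState = "closed" | "open" | "half-open";

// Assumption: a dedicated error type so callers can tell fast-fails from real failures.
class CircuitOpenError extends Error {
  constructor() {
    super("Circuit is open; request was not sent");
    this.name = "CircuitOpenError";
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on older compile targets
  }
}

class CircuitBreaker {
  private state: CircuitState = "closed";
  private failureCount = 0;
  private lastFailureTime = 0;

  constructor(
    private readonly failureThreshold = 5,
    private readonly resetTimeoutMs = 30_000,
    private readonly now: () => number = () => Date.now(), // injectable clock for tests
  ) {}

  get currentState(): CircuitState {
    return this.state;
  }

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      const elapsed = this.now() - this.lastFailureTime;
      if (elapsed < this.resetTimeoutMs) {
        // Fast-fail without touching the network.
        throw new CircuitOpenError();
      }
      // Reset timeout elapsed: let a probe request through.
      this.state = "half-open";
    }

    try {
      const result = await fn();
      // Any success closes the circuit and clears the failure history.
      this.failureCount = 0;
      this.state = "closed";
      return result;
    } catch (error) {
      this.failureCount++;
      this.lastFailureTime = this.now();
      // A failed probe, or too many failures while closed, opens the circuit.
      if (this.state === "half-open" || this.failureCount >= this.failureThreshold) {
        this.state = "open";
      }
      throw error;
    }
  }
}
```

When a breaker sits underneath a retry layer, it usually makes sense to treat `CircuitOpenError` as non-retryable, since retrying a fast-fail only burns the attempt budget.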
RATE LIMITING WITH TOKEN BUCKET
Most APIs impose rate limits. Rather than hitting the limit and receiving 429 errors, self-impose a client-side rate limiter using the token bucket algorithm. The bucket refills at a fixed rate and each request consumes one token.
A TokenBucket class takes a capacity representing the maximum burst size and a refill rate measured in tokens per second. Its consume method first calculates how much time has elapsed since the last refill and adds the proportional number of tokens back, capped at capacity. If enough tokens are available, it deducts one and returns immediately. If not, it calculates how long to wait for the deficit to refill, waits that duration, then proceeds.
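For an API with a documented limit of ten requests per second, a bucket with capacity twenty and refill rate ten allows short bursts while holding the long-run average at ten requests per second. Note that a full bucket can release up to twenty requests at once, so if the provider enforces its limit over strict one-second windows, set the capacity no higher than the limit. The bucket smooths out traffic automatically, so callers do not need to think about pacing.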
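A token bucket along these lines might be sketched as follows. The injectable `now` and `sleep` parameters are assumptions added so the timing can be controlled in tests. The sketch also re-checks the balance in a loop after waiting, rather than proceeding unconditionally, so that concurrent callers cannot drive the balance negative.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number, // maximum burst size
    private readonly refillPerSecond: number, // sustained rate
    // Assumption: clock and sleep are injectable purely for testability.
    private readonly now: () => number = () => Date.now(),
    private readonly sleep: (ms: number) => Promise<void> = (ms) =>
      new Promise<void>((resolve) => setTimeout(resolve, ms)),
  ) {
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefill = this.now();
  }

  // Add tokens in proportion to the time elapsed, never exceeding capacity.
  private refill(): void {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = current;
  }

  // Resolves once a token has been taken, waiting if the bucket is empty.
  async consume(): Promise<void> {
    this.refill();
    while (this.tokens < 1) {
      const deficit = 1 - this.tokens;
      await this.sleep((deficit / this.refillPerSecond) * 1000);
      this.refill();
    }
    this.tokens -= 1;
  }
}
```

Usage is a single line before each outgoing call, for example `await bucket.consume()` followed by the request itself, with `bucket` created once and shared by every caller of that API.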
CACHING WITH STALE-WHILE-REVALIDATE
The stale-while-revalidate strategy serves cached data immediately while refreshing it in the background. It works well for data that changes infrequently but where freshness still matters.
An SWRCache class holds two maps internally: one for cache entries (each storing the data, the time it was fetched, and its TTL), and one tracking in-flight requests. The get method accepts a cache key, a fetcher function, and a TTL. If a fresh entry exists it returns immediately with no network call. If the entry is stale, it kicks off a background revalidation and returns the stale data right away so the user sees something instantly. If there is no entry at all, it fetches and waits for the result.
The in-flight deduplication is subtle but crucial. Without it, ten components calling getUser("123") before the first response arrives would fire ten identical API requests. By storing the in-progress Promise and returning it to any subsequent caller for the same key, at most one request per key is in flight at a time, no matter how many callers pile on.
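The cache and its deduplication could be sketched like this. The private `revalidate` helper and the injectable `now` clock are assumptions made for illustration. A failed background refresh is swallowed here so that stale data keeps being served; whether that is the right policy depends on the data.

```typescript
interface CacheEntry<T> {
  data: T;
  fetchedAt: number;
  ttlMs: number;
}

class SWRCache {
  private readonly entries = new Map<string, CacheEntry<unknown>>();
  private readonly inFlight = new Map<string, Promise<unknown>>();

  // Assumption: the clock is injectable purely for testability.
  constructor(private readonly now: () => number = () => Date.now()) {}

  async get<T>(key: string, fetcher: () => Promise<T>, ttlMs: number): Promise<T> {
    const entry = this.entries.get(key) as CacheEntry<T> | undefined;

    // Nothing cached yet: the caller has to wait for the network.
    if (!entry) return this.revalidate(key, fetcher, ttlMs);

    const isFresh = this.now() - entry.fetchedAt < entry.ttlMs;
    if (!isFresh) {
      // Stale: refresh in the background and ignore failures for now.
      this.revalidate(key, fetcher, ttlMs).catch(() => undefined);
    }
    return entry.data;
  }

  // Starts a fetch for the key, or joins the one already in progress.
  private revalidate<T>(key: string, fetcher: () => Promise<T>, ttlMs: number): Promise<T> {
    const pending = this.inFlight.get(key) as Promise<T> | undefined;
    if (pending) return pending;

    const request = fetcher().then(
      (data) => {
        this.inFlight.delete(key);
        this.entries.set(key, { data, fetchedAt: this.now(), ttlMs });
        return data;
      },
      (error) => {
        this.inFlight.delete(key);
        throw error;
      },
    );
    this.inFlight.set(key, request);
    return request;
  }
}
```

A call such as `cache.get("user:123", () => loadUser("123"), 60_000)` would then return cached data when it exists and only block on the first load; `loadUser` and the key format are placeholders.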
REQUEST AND RESPONSE INTERCEPTORS
Interceptors let you apply cross-cutting logic such as logging, token refresh, and error normalization without cluttering individual call sites.
An InterceptableClient extends the base ApiClient and maintains two arrays of interceptor functions: one for requests and one for responses. A use method registers a pair. Before each request, the client pipes the options through all registered request interceptors in sequence, allowing each one to modify the config. After getting a response, it pipes the result through all response interceptors. This lets you register a development logger that prints every outgoing request, or an auth interceptor that automatically calls refreshToken when it sees a 401 come back, all without touching the actual fetch call sites scattered through your application.
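The interceptor pipeline might be sketched as follows. To keep the example self-contained, this version wraps a hypothetical `Transport` function instead of extending ApiClient, and the `RequestConfig` and `ClientResponse` shapes, along with the `withBearerToken` helper, are assumptions made for illustration.

```typescript
interface RequestConfig {
  path: string;
  method: string;
  headers: Record<string, string>;
  body?: string;
}

interface ClientResponse<T = unknown> {
  status: number;
  data: T;
}

type RequestInterceptor = (config: RequestConfig) => RequestConfig | Promise<RequestConfig>;
type ResponseInterceptor = (
  response: ClientResponse,
  config: RequestConfig,
) => ClientResponse | Promise<ClientResponse>;

// Hypothetical stand-in for the underlying client's request method.
type Transport = (config: RequestConfig) => Promise<ClientResponse>;

class InterceptableClient {
  private readonly requestInterceptors: RequestInterceptor[] = [];
  private readonly responseInterceptors: ResponseInterceptor[] = [];

  constructor(private readonly transport: Transport) {}

  // Registers a pair; either half may be omitted.
  use(onRequest?: RequestInterceptor, onResponse?: ResponseInterceptor): this {
    if (onRequest) this.requestInterceptors.push(onRequest);
    if (onResponse) this.responseInterceptors.push(onResponse);
    return this;
  }

  async request<T>(config: RequestConfig): Promise<ClientResponse<T>> {
    // Request interceptors run in registration order, each seeing the previous one's output.
    let current = config;
    for (const intercept of this.requestInterceptors) {
      current = await intercept(current);
    }

    let response = await this.transport(current);

    // Response interceptors run in registration order as well.
    for (const intercept of this.responseInterceptors) {
      response = await intercept(response, current);
    }
    return response as ClientResponse<T>;
  }
}

// Example request interceptor: attach the current token to every call.
function withBearerToken(getToken: () => string): RequestInterceptor {
  return (config) => ({
    ...config,
    headers: { ...config.headers, Authorization: `Bearer ${getToken()}` },
  });
}
```

A development logger would be registered the same way, for example `client.use((config) => { console.debug(config.method, config.path); return config; })`. An interceptor that reacts to a 401 by refreshing the token would live on the response side and would need access to the transport in order to replay the request.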
COMPOSING ALL THE PATTERNS
In production you layer these patterns rather than choosing between them. A typical stack looks like this: every outgoing request first passes through the rate limiter to respect API quotas. It then enters the withRetry wrapper, which internally calls the circuit breaker's execute method, which finally calls the underlying API client. For read-heavy endpoints, a top-level SWR cache wraps the whole thing so repeat calls skip the network entirely when data is still fresh.
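The layering can be expressed as a small composition function. The sketch below is deliberately abstract: `RateLimiter`, `Breaker`, `Retry`, and `ReadCache` are hypothetical minimal interfaces standing in for the pieces described earlier, and `createResilientCall` is an illustrative name rather than an established API.

```typescript
// Hypothetical minimal interfaces for the layers; real implementations plug in here.
interface RateLimiter {
  consume(): Promise<void>;
}

interface Breaker {
  execute<T>(fn: () => Promise<T>): Promise<T>;
}

type Retry = <T>(fn: () => Promise<T>) => Promise<T>;

interface ReadCache {
  get<T>(key: string, fetcher: () => Promise<T>, ttlMs: number): Promise<T>;
}

interface ResilienceStack {
  limiter: RateLimiter;
  retry: Retry;
  breaker: Breaker;
  cache: ReadCache;
}

function createResilientCall(stack: ResilienceStack) {
  // Order follows the description above: rate limiter, then retry, then breaker, then the API.
  async function call<T>(fn: () => Promise<T>): Promise<T> {
    await stack.limiter.consume();
    return stack.retry(() => stack.breaker.execute(fn));
  }

  // Read-heavy endpoints add the cache as the outermost layer,
  // so fresh hits never reach the limiter or the network.
  function cachedCall<T>(key: string, fn: () => Promise<T>, ttlMs: number): Promise<T> {
    return stack.cache.get(key, () => call(fn), ttlMs);
  }

  return { call, cachedCall };
}
```

One consequence of this ordering is that a request takes a single token however many times it is retried. If every attempt should count against the quota, move the `consume()` call inside the function passed to `retry`.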
Here is a quick decision guide for when to reach for each tool:
- Retries: use for 429 responses, 5xx errors, and network timeouts
- Circuit breaker: use for third-party services you do not control
- Rate limiting: use whenever the API has documented limits, and as a precaution even when it does not
- SWR caching: use for read-heavy data that can tolerate brief staleness
- Interceptors: use for auth, logging, and error normalization
The goal is integrations that fail gracefully, recover automatically, and give you the visibility you need when something does go wrong.
Learn the best patterns for integrating third-party APIs including error handling, rate limiting, and caching strategies.
January 20, 2024 · 15 min read
Tags: API, Best Practices, Error Handling