Fail-fast admission control for outbound fetch calls, built on async-bulkhead-ts.
async-bulkhead-fetch protects expensive or fragile downstream HTTP dependencies by limiting how many calls are allowed in flight at once. When protected capacity is full, calls reject early instead of silently piling up latency.
- Admission control for outbound fetch calls
- Local overload isolation for downstream HTTP dependencies
- Shared-capacity protection for related external or internal services
- Fail-fast rejection when protected capacity is full
- Optional bounded queueing with queue wait timeouts
- Response-body-aware capacity release
- Fetch-friendly hooks, labels, metadata, stats, close(), and drain()
async-bulkhead-fetch is not:
- A rate limiter
- A retry library
- A circuit breaker
- A request timeout library
- A caching layer
- A distributed quota or cross-instance coordination system
npm i async-bulkhead-fetch
Node.js 20+ is supported. The package uses the standard fetch API by default. You can pass a custom fetch implementation for tests or specialized runtimes.
import { createBulkheadFetch } from 'async-bulkhead-fetch';
const guardedFetch = createBulkheadFetch({
  name: 'payments-api',
  maxConcurrent: 10,
  maxQueue: 0,
});
const response = await guardedFetch('https://payments.example.com/charge', {
  method: 'POST',
  body: JSON.stringify(payload),
  headers: { 'content-type': 'application/json' },
});
const data = await response.json();
When capacity is full, the wrapper throws FetchBulkheadRejectedError before calling the downstream service.
import { FetchBulkheadRejectedError } from 'async-bulkhead-fetch';
try {
  const response = await guardedFetch('https://api.example.com/search');
  return await response.json();
} catch (err) {
  if (err instanceof FetchBulkheadRejectedError) {
    // err.reason:
    // 'concurrency_limit' | 'queue_limit' | 'timeout'
    // | 'aborted' | 'shutdown'
    return fallbackOr503(err.reason);
  }
  throw err;
}
Use createFetchBulkhead() when you need shared capacity, stats, or graceful shutdown.
import { createFetchBulkhead } from 'async-bulkhead-fetch';
const searchApi = createFetchBulkhead({
  name: 'search-api',
  maxConcurrent: 20,
  maxQueue: 5,
  queueWaitTimeoutMs: 100,
});
export async function search(query: string) {
  const response = await searchApi.fetch(
    `https://search.example.com?q=${encodeURIComponent(query)}`,
  );
  return response.json();
}
// In your SIGTERM handler:
searchApi.close();
await searchApi.drain();
The returned object exposes:
- fetch(input, init?, options?)
- stats()
- close()
- drain()
One bulkhead can protect a whole dependency, even if many call sites use it.
const stripe = createFetchBulkhead({
  name: 'stripe',
  maxConcurrent: 8,
  maxQueue: 0,
});
await stripe.fetch('https://api.stripe.com/v1/payment_intents', init);
await stripe.fetch('https://api.stripe.com/v1/refunds', init);
Both calls contend for the same local capacity pool.
Queueing is opt-in and bounded.
const analytics = createFetchBulkhead({
  name: 'analytics-api',
  maxConcurrent: 4,
  maxQueue: 8,
  queueWaitTimeoutMs: 250,
});
Semantics:
- If inFlight < maxConcurrent, the call starts immediately.
- Else if maxQueue has room, the call waits FIFO.
- Else the call rejects immediately.
- queueWaitTimeoutMs applies only while waiting for admission.
- It does not time out the downstream HTTP request.
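The admission rules above can be read as a small decision function. The sketch below is illustrative only and is not the package's implementation; the names AdmissionState and decideAdmission are hypothetical, and it ignores queue wait timeouts and aborts.

```typescript
// Hypothetical model of the admission rules; not the package's internal code.
type AdmissionState = {
  inFlight: number; // calls currently holding capacity
  pending: number; // calls currently waiting in the queue
  maxConcurrent: number;
  maxQueue: number;
};

type AdmissionDecision = 'start' | 'queue' | 'reject';

function decideAdmission(s: AdmissionState): AdmissionDecision {
  // A free slot exists: the call starts immediately.
  if (s.inFlight < s.maxConcurrent) return 'start';
  // No free slot, but the bounded queue has room: wait FIFO.
  if (s.pending < s.maxQueue) return 'queue';
  // Both the slots and the queue are full: fail fast.
  return 'reject';
}
```

With maxQueue: 0 the second branch can never be taken, so a full bulkhead always rejects immediately.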
Use an AbortSignal to cancel queued admission and the downstream fetch:
const ac = new AbortController();
const response = await guardedFetch(
  'https://api.example.com/items',
  { signal: ac.signal },
);
Queue wait timeouts and request timeouts are separate. Compose a downstream request timeout with AbortSignal.timeout() or your existing timeout policy:
const response = await guardedFetch(
  'https://api.example.com/items',
  { signal: AbortSignal.timeout(2_000) },
  { queueWaitTimeoutMs: 50 },
);
By default, async-bulkhead-fetch holds capacity until the response body is consumed, is cancelled, or errors, or until the caller's AbortSignal aborts after headers have been received.
const response = await guardedFetch('https://api.example.com/items');
// Capacity is still held here by default.
const data = await response.json();
// Capacity is released after json() resolves or rejects.
This protects the full outbound HTTP lifecycle for normal API calls. If you do not intend to read the body, cancel it:
const response = await guardedFetch('https://api.example.com/fire-and-forget');
await response.body?.cancel();
Important: in default body-scoped mode, unread response bodies hold capacity. Always consume or cancel the body, or use releaseOn: 'headers' for intentionally header-only calls.
Cloned responses share the same admission token. Capacity is released only after every original or cloned response body branch that was created has been consumed, cancelled, or errored.
const response = await guardedFetch('https://api.example.com/items');
const clone = response.clone();
await response.json(); // capacity is still held because clone is unread
await clone.text(); // capacity can now be released
Or opt into header-scoped release:
const guardedFetch = createBulkheadFetch({
  name: 'headers-only-api',
  maxConcurrent: 10,
  releaseOn: 'headers',
});
releaseOn: 'headers' releases capacity as soon as fetch() resolves with response headers. This avoids holding capacity for callers that intentionally do not consume response bodies, but it does not protect response body streaming time.
You can override release behavior per call:
await guardedFetch(url, init, { releaseOn: 'headers' });
Use low-cardinality labels for metrics and metadata for request-scoped logs/traces.
const github = createFetchBulkhead({
  name: 'github-api',
  label: 'github',
  maxConcurrent: 6,
  maxQueue: 0,
  metadata: (_input, init) => ({ method: init?.method ?? 'GET' }),
  onReject(event) {
    metrics.increment('bulkhead.fetch.reject', {
      bulkhead: event.name,
      dependency: event.label,
      reason: event.reason,
    });
  },
  onRelease(event) {
    metrics.gauge('bulkhead.fetch.in_flight', event.inFlight, {
      bulkhead: event.name,
      dependency: event.label,
    });
  },
});
Hooks are fire-and-forget. Synchronous hook exceptions and asynchronous hook rejections are swallowed and counted in stats().hookErrors.
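As an illustration of what fire-and-forget can mean in practice, here is a minimal sketch of a hook runner that swallows and counts both kinds of failure. It is an assumption about the general technique, not the package's code; createHookRunner and its return shape are hypothetical names.

```typescript
// Hypothetical sketch of fire-and-forget hook invocation; not the package's code.
type Hook<E> = (event: E) => void | Promise<void>;

function createHookRunner() {
  let hookErrors = 0;

  function run<E>(hook: Hook<E> | undefined, event: E): void {
    if (!hook) return;
    try {
      // Promise.resolve() wraps both sync and async hooks, so an async
      // rejection is counted here without ever reaching the caller.
      Promise.resolve(hook(event)).catch(() => {
        hookErrors += 1;
      });
    } catch {
      // A synchronous throw from the hook is counted and swallowed.
      hookErrors += 1;
    }
  }

  return { run, stats: () => ({ hookErrors }) };
}
```

The caller never awaits the hook, so a slow or failing hook cannot delay admission or release.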
createBulkheadFetch(options) creates a callable fetch wrapper backed by an internal bulkhead instance.
const guardedFetch = createBulkheadFetch({
  name: 'openai',
  maxConcurrent: 10,
});
createFetchBulkhead(options) creates a reusable wrapper with explicit lifecycle methods.
const bulkhead = createFetchBulkhead({
  name: 'openai',
  maxConcurrent: 10,
});
await bulkhead.fetch(url, init);
bulkhead.stats();
bulkhead.close();
await bulkhead.drain();
type FetchBulkheadOptions = {
  name?: string;
  maxConcurrent: number;
  maxQueue?: number;
  queueWaitTimeoutMs?: number; // admission wait timeout only
  fetch?: typeof fetch;
  releaseOn?: 'body' | 'headers'; // validated at runtime
  label?: string | ((input: RequestInfo | URL, init?: RequestInit) => string | undefined);
  metadata?: (input: RequestInfo | URL, init?: RequestInit) => Record<string, unknown> | undefined;
  onAdmit?: (event: FetchBulkheadEvent) => void | Promise<void>;
  onReject?: (event: FetchBulkheadRejectEvent) => void | Promise<void>;
  onRelease?: (event: FetchBulkheadReleaseEvent) => void | Promise<void>;
};
type FetchBulkheadRequestOptions = {
  queueWaitTimeoutMs?: number;
  signal?: AbortSignal;
  releaseOn?: 'body' | 'headers'; // validated at runtime
  label?: string;
  metadata?: Record<string, unknown>;
};
The third argument is intentionally separate from standard fetch(input, init) options.
await bulkhead.fetch(url, init, {
  queueWaitTimeoutMs: 50,
  releaseOn: 'headers',
  label: 'search-api',
});
class FetchBulkheadRejectedError extends Error {
  readonly code = 'FETCH_BULKHEAD_REJECTED';
  readonly reason:
    | 'concurrency_limit'
    | 'queue_limit'
    | 'timeout'
    | 'aborted'
    | 'shutdown';
}
const s = bulkhead.stats();
s.inFlight;
s.pending;
s.maxConcurrent;
s.maxQueue;
s.closed;
s.totalAdmitted;
s.totalReleased;
s.rejected;
s.rejectedByReason;
s.aborted;
s.timedOut;
s.doubleRelease;
s.inFlightUnderflow;
s.hookErrors;
stats() is a pure read.
- maxConcurrent must be a positive integer.
- maxQueue must be a non-negative integer.
- queueWaitTimeoutMs, when set, must be a finite number greater than or equal to zero.
- Admission happens before the downstream fetch implementation is called.
- Rejected calls do not call the downstream service.
- maxQueue: 0 gives fail-fast behavior with no queueing.
- maxQueue > 0 enables bounded FIFO waiting.
- Caller aborts can cancel queued admission.
- Caller aborts are forwarded to the underlying fetch call after admission.
- Caller aborts after headers release capacity in default body-scoped mode.
- Capacity is always released on fetch error.
- In default releaseOn: 'body' mode, capacity is released when all created body branches are consumed, cancelled, or error out.
- In releaseOn: 'headers' mode, capacity is released when fetch() resolves.
- Duplicate body lifecycle paths do not double-release capacity.
- releaseOn values are validated at runtime.
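The body-scoped release guarantees in this list (release only after every body branch settles, and never release twice) can be sketched as a reference-counted token. This is a hypothetical model for intuition, not the package's internals; createAdmissionToken, openBranch, and releaseOnce are made-up names.

```typescript
// Hypothetical sketch of body-scoped release accounting; not the package's code.
function createAdmissionToken(release: () => void) {
  let openBranches = 0;
  let released = false;

  // Idempotent: fetch errors, aborts, and body completion may all call this.
  function releaseOnce(): void {
    if (released) return;
    released = true;
    release();
  }

  // Register one body branch (the original response or a clone).
  // Returns a settle callback for "consumed, cancelled, or errored".
  function openBranch(): () => void {
    openBranches += 1;
    let settled = false;
    return function settle(): void {
      if (settled) return; // the same branch may report more than one outcome
      settled = true;
      openBranches -= 1;
      if (openBranches === 0) releaseOnce();
    };
  }

  return { openBranch, releaseOnce };
}
```

The sketch assumes clones are registered before any branch settles, which matches how Response.clone() works: a response cannot be cloned after its body has been used.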
Bulkheads are local to the Node.js process that creates them. If an app runs four worker processes, containers, or pods, each process gets its own independent maxConcurrent and maxQueue capacity. For example, maxConcurrent: 10 on four pods can allow up to forty concurrent protected outbound calls across the deployment.
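The arithmetic above can be written down directly. The helpers below are hypothetical and only illustrate the multiplication and its inverse; the package itself does no cross-process coordination.

```typescript
// Worst-case concurrent protected calls across a deployment when every
// process runs its own independent bulkhead.
function deploymentWideLimit(processCount: number, maxConcurrentPerProcess: number): number {
  return processCount * maxConcurrentPerProcess;
}

// Per-process maxConcurrent that keeps the whole deployment at or under a
// downstream budget. maxConcurrent must be a positive integer, so the result
// is floored at 1, which exceeds the budget when processCount > budget.
function perProcessLimit(downstreamBudget: number, processCount: number): number {
  return Math.max(1, Math.floor(downstreamBudget / processCount));
}
```

For example, deploymentWideLimit(4, 10) is 40, and a downstream budget of 10 shared by four pods gives perProcessLimit(10, 4) of 2.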
Pick initial values from the constrained dependency you are protecting, such as provider concurrency, DB-adjacent APIs, internal service capacity, or expensive fan-out. Start with maxQueue: 0 for fail-fast behavior when latency matters. Use a small queue only when brief bursts are normal and a bounded wait is better than immediate rejection.
Keep metric labels low-cardinality. Prefer labels like stripe, openai, search-api, or payments-service. Avoid raw URLs, user IDs, request IDs, tenant IDs, and query strings as metric labels.
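One way to keep labels low-cardinality is to map the request host to a small fixed set of names and bucket everything else. The helper below is a hypothetical sketch: labelFor and KNOWN_HOSTS are illustrative names, the host list is an assumption, and it handles string and URL inputs only (a Request object would need its url property read first).

```typescript
// Hypothetical helper: derive a low-cardinality label from the request target.
// The host-to-label table is illustrative; fill it with your own dependencies.
const KNOWN_HOSTS: Record<string, string> = {
  'api.stripe.com': 'stripe',
  'api.openai.com': 'openai',
};

function labelFor(input: string | URL): string {
  // Only the hostname is used, so paths, query strings, and IDs never
  // leak into the label.
  const host = new URL(input).hostname;
  return KNOWN_HOSTS[host] ?? 'other';
}
```

A function like this could be adapted for the label option, which accepts a function of the request input.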
This package composes around async-bulkhead-ts. It does not replace higher-level concerns:
- Retries: compose retry behavior outside the bulkhead.
- Request timeouts: use AbortController or your existing HTTP timeout policy.
- Rate limits: use provider-aware throttling or token buckets when you need rate semantics.
- Distributed quotas: use a shared control plane or distributed limiter when you need cross-instance coordination.
Rule of thumb:
If you want to eventually send every HTTP call, use a queue.
If you want to protect your service and downstream dependency under overload, use a bulkhead.
Apache-2.0. See LICENSE.
npm test
npm run verify