janbalangue/async-bulkhead-fetch

async-bulkhead-fetch

Fail-fast admission control for outbound fetch calls, built on async-bulkhead-ts.

async-bulkhead-fetch protects expensive or fragile downstream HTTP dependencies by limiting how many calls are allowed in flight at once. When protected capacity is full, calls reject early instead of silently piling up latency.

What this package is

  • Admission control for outbound fetch calls
  • Local overload isolation for downstream HTTP dependencies
  • Shared-capacity protection for related external or internal services
  • Fail-fast rejection when protected capacity is full
  • Optional bounded queueing with queue wait timeouts
  • Response-body-aware capacity release
  • Fetch-friendly hooks, labels, metadata, stats, close(), and drain()

What this package is not

  • A rate limiter
  • A retry library
  • A circuit breaker
  • A request timeout library
  • A caching layer
  • A distributed quota or cross-instance coordination system

Install

npm i async-bulkhead-fetch

Node.js 20+ is supported. The package uses the standard fetch API by default. You can pass a custom fetch implementation for tests or specialized runtimes.

Quick start

import { createBulkheadFetch } from 'async-bulkhead-fetch';

const guardedFetch = createBulkheadFetch({
  name: 'payments-api',
  maxConcurrent: 10,
  maxQueue: 0,
});

const response = await guardedFetch('https://payments.example.com/charge', {
  method: 'POST',
  body: JSON.stringify(payload),
  headers: { 'content-type': 'application/json' },
});

const data = await response.json();

When capacity is full, the wrapper throws FetchBulkheadRejectedError before calling the downstream service.

import { FetchBulkheadRejectedError } from 'async-bulkhead-fetch';

try {
  const response = await guardedFetch('https://api.example.com/search');
  return await response.json();
} catch (err) {
  if (err instanceof FetchBulkheadRejectedError) {
    // err.reason:
    // 'concurrency_limit' | 'queue_limit' | 'timeout'
    // | 'aborted' | 'shutdown'
    return fallbackOr503(err.reason);
  }
  throw err;
}

Reusable bulkhead object

Use createFetchBulkhead() when you need shared capacity, stats, or graceful shutdown.

import { createFetchBulkhead } from 'async-bulkhead-fetch';

const searchApi = createFetchBulkhead({
  name: 'search-api',
  maxConcurrent: 20,
  maxQueue: 5,
  queueWaitTimeoutMs: 100,
});

export async function search(query: string) {
  const response = await searchApi.fetch(
    `https://search.example.com?q=${encodeURIComponent(query)}`,
  );
  return response.json();
}

// In your SIGTERM handler:
searchApi.close();
await searchApi.drain();

The returned object exposes:

  • fetch(input, init?, options?)
  • stats()
  • close()
  • drain()

Shared capacity for related dependencies

One bulkhead can protect a whole dependency, even if many call sites use it.

const stripe = createFetchBulkhead({
  name: 'stripe',
  maxConcurrent: 8,
  maxQueue: 0,
});

await stripe.fetch('https://api.stripe.com/v1/payment_intents', init);
await stripe.fetch('https://api.stripe.com/v1/refunds', init);

Both calls contend for the same local capacity pool.

Bounded queueing

Queueing is opt-in and bounded.

const analytics = createFetchBulkhead({
  name: 'analytics-api',
  maxConcurrent: 4,
  maxQueue: 8,
  queueWaitTimeoutMs: 250,
});

Semantics:

  • If inFlight < maxConcurrent, the call starts immediately.
  • Else if maxQueue has room, the call waits FIFO.
  • Else the call rejects immediately.
  • queueWaitTimeoutMs applies only while waiting for admission.
  • It does not time out the downstream HTTP request.
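
The admission rules above can be modeled as a small pure function. This is an illustrative sketch only, not the package's implementation: `decideAdmission`, `AdmissionState`, and `AdmissionDecision` are hypothetical names introduced here.

```typescript
// Hypothetical model of the admission rules listed above.
// Not the package's internal code; all names are illustrative.
type AdmissionState = {
  inFlight: number;      // calls currently holding capacity
  pending: number;       // calls currently waiting in the queue
  maxConcurrent: number;
  maxQueue: number;
};

type AdmissionDecision = 'start' | 'queue' | 'reject';

function decideAdmission(state: AdmissionState): AdmissionDecision {
  // Free capacity: the call starts immediately.
  if (state.inFlight < state.maxConcurrent) return 'start';
  // Capacity is full but the bounded queue has room: wait FIFO.
  if (state.pending < state.maxQueue) return 'queue';
  // Capacity and queue are both full: reject immediately.
  return 'reject';
}

const limits = { maxConcurrent: 4, maxQueue: 8 };
console.log(decideAdmission({ ...limits, inFlight: 3, pending: 0 })); // start
console.log(decideAdmission({ ...limits, inFlight: 4, pending: 7 })); // queue
console.log(decideAdmission({ ...limits, inFlight: 4, pending: 8 })); // reject
```

With maxQueue: 0 the second branch can never match, so the model falls straight through to rejection, which is the fail-fast configuration.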

Use an AbortSignal to cancel queued admission and the downstream fetch:

const ac = new AbortController();

// Calling ac.abort() cancels the call whether it is still queued or already in flight.
const response = await guardedFetch(
  'https://api.example.com/items',
  { signal: ac.signal },
);

Queue wait timeouts and request timeouts are separate. Compose a downstream request timeout with AbortSignal.timeout() or your existing timeout policy:

const response = await guardedFetch(
  'https://api.example.com/items',
  { signal: AbortSignal.timeout(2_000) },
  { queueWaitTimeoutMs: 50 },
);

Capacity release behavior

By default, async-bulkhead-fetch holds capacity until the response body is consumed, is cancelled, or errors, or until the caller's AbortSignal aborts after headers have been received.

const response = await guardedFetch('https://api.example.com/items');

// Capacity is still held here by default.
const data = await response.json();

// Capacity is released after json() resolves or rejects.

This protects the full outbound HTTP lifecycle for normal API calls. If you do not intend to read the body, cancel it:

const response = await guardedFetch('https://api.example.com/fire-and-forget');
await response.body?.cancel();

Important: in default body-scoped mode, unread response bodies hold capacity. Always consume or cancel the body, or use releaseOn: 'headers' for intentionally header-only calls.

Cloned responses share the same admission token. Capacity is released only after every body branch that was created, the original and each clone, has been consumed, cancelled, or errored.

const response = await guardedFetch('https://api.example.com/items');
const clone = response.clone();

await response.json(); // capacity is still held because clone is unread
await clone.text();    // capacity can now be released
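
One way to picture the clone behavior is a shared token that counts open body branches. The sketch below is a simplified model under that assumption, not the package's implementation: `SharedToken`, `addBranch`, and `settleBranch` are hypothetical names.

```typescript
// Simplified model of one admission token shared by a response and its clones.
// Hypothetical names; the real package may track branches differently.
class SharedToken {
  private openBranches = 1; // the original response body
  private released = false;
  private readonly onRelease: () => void;

  constructor(onRelease: () => void) {
    this.onRelease = onRelease;
  }

  // Called when response.clone() creates another body branch.
  addBranch(): void {
    if (!this.released) this.openBranches += 1;
  }

  // Called when a branch is consumed, cancelled, or errors.
  settleBranch(): void {
    if (this.released) return; // a late or duplicate settle never releases twice
    this.openBranches -= 1;
    if (this.openBranches === 0) {
      this.released = true;
      this.onRelease();
    }
  }

  get isReleased(): boolean {
    return this.released;
  }
}

let releases = 0;
const token = new SharedToken(() => { releases += 1; });
token.addBranch();    // response.clone()
token.settleBranch(); // original body read; the clone is still unread
console.log(token.isReleased); // false
token.settleBranch(); // clone read; capacity can now be released
console.log(token.isReleased, releases); // true 1
```

The `released` guard in this model also illustrates why duplicate body lifecycle paths do not need to release capacity more than once.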

Or opt into header-scoped release:

const guardedFetch = createBulkheadFetch({
  name: 'headers-only-api',
  maxConcurrent: 10,
  releaseOn: 'headers',
});

releaseOn: 'headers' releases capacity as soon as fetch() resolves with response headers. This avoids holding capacity for callers that intentionally do not consume response bodies, but it does not protect response body streaming time.

You can override release behavior per call:

await guardedFetch(url, init, { releaseOn: 'headers' });

Observability hooks

Use low-cardinality labels for metrics and metadata for request-scoped logs/traces.

const github = createFetchBulkhead({
  name: 'github-api',
  label: 'github',
  maxConcurrent: 6,
  maxQueue: 0,
  metadata: (_input, init) => ({ method: init?.method ?? 'GET' }),
  onReject(event) {
    metrics.increment('bulkhead.fetch.reject', {
      bulkhead: event.name,
      dependency: event.label,
      reason: event.reason,
    });
  },
  onRelease(event) {
    metrics.gauge('bulkhead.fetch.in_flight', event.inFlight, {
      bulkhead: event.name,
      dependency: event.label,
    });
  },
});

Hooks are fire-and-forget. Synchronous hook exceptions and asynchronous hook rejections are swallowed and counted in stats().hookErrors.
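
A minimal sketch of what fire-and-forget hook invocation can look like, assuming a single error counter. This is not the package's code: `runHook` and `hookStats` are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch of fire-and-forget hook invocation.
// Not the package's code; `runHook` and `hookStats` are illustrative names.
type Hook<E> = (event: E) => void | Promise<void>;

const hookStats = { hookErrors: 0 };

function runHook<E>(hook: Hook<E> | undefined, event: E): void {
  if (!hook) return;
  try {
    // Promise.resolve() adopts async hooks; a rejection is counted later,
    // on a microtask, and is never awaited or rethrown.
    Promise.resolve(hook(event)).catch(() => {
      hookStats.hookErrors += 1;
    });
  } catch {
    // A synchronous throw is counted here and does not reach the caller.
    hookStats.hookErrors += 1;
  }
}

runHook(() => { throw new Error('metrics client down'); }, { name: 'github-api' });
console.log(hookStats.hookErrors); // 1
```

Because the caller never awaits the hook, a slow or failing metrics client in this model cannot delay or fail the protected request.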

API

createBulkheadFetch(options)

Creates a callable fetch wrapper backed by an internal bulkhead instance.

const guardedFetch = createBulkheadFetch({
  name: 'openai',
  maxConcurrent: 10,
});

createFetchBulkhead(options)

Creates a reusable wrapper with explicit lifecycle methods.

const bulkhead = createFetchBulkhead({
  name: 'openai',
  maxConcurrent: 10,
});

await bulkhead.fetch(url, init);
bulkhead.stats();
bulkhead.close();
await bulkhead.drain();

FetchBulkheadOptions

type FetchBulkheadOptions = {
  name?: string;
  maxConcurrent: number;
  maxQueue?: number;
  queueWaitTimeoutMs?: number; // admission wait timeout only
  fetch?: typeof fetch;
  releaseOn?: 'body' | 'headers'; // validated at runtime
  label?: string | ((input: RequestInfo | URL, init?: RequestInit) => string | undefined);
  metadata?: (input: RequestInfo | URL, init?: RequestInit) => Record<string, unknown> | undefined;
  onAdmit?: (event: FetchBulkheadEvent) => void | Promise<void>;
  onReject?: (event: FetchBulkheadRejectEvent) => void | Promise<void>;
  onRelease?: (event: FetchBulkheadReleaseEvent) => void | Promise<void>;
};

Per-call options

type FetchBulkheadRequestOptions = {
  queueWaitTimeoutMs?: number;
  signal?: AbortSignal;
  releaseOn?: 'body' | 'headers'; // validated at runtime
  label?: string;
  metadata?: Record<string, unknown>;
};

The third argument is intentionally separate from standard fetch(input, init) options.

await bulkhead.fetch(url, init, {
  queueWaitTimeoutMs: 50,
  releaseOn: 'headers',
  label: 'search-api',
});

Rejection error

class FetchBulkheadRejectedError extends Error {
  readonly code = 'FETCH_BULKHEAD_REJECTED';
  readonly reason:
    | 'concurrency_limit'
    | 'queue_limit'
    | 'timeout'
    | 'aborted'
    | 'shutdown';
}

Stats

const s = bulkhead.stats();

s.inFlight;
s.pending;
s.maxConcurrent;
s.maxQueue;
s.closed;
s.totalAdmitted;
s.totalReleased;
s.rejected;
s.rejectedByReason;
s.aborted;
s.timedOut;
s.doubleRelease;
s.inFlightUnderflow;
s.hookErrors;

stats() is a pure read.
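
As a sketch of how a snapshot might feed a dashboard, the helper below derives a few ratios. It assumes the fields it reads are plain numbers with the meanings their names suggest, which this README does not spell out; `summarize` and `StatsSnapshot` are hypothetical names.

```typescript
// Hypothetical helper; assumes these stats fields are plain numeric
// counters and gauges with the meanings their names suggest.
type StatsSnapshot = {
  inFlight: number;
  pending: number;
  maxConcurrent: number;
  maxQueue: number;
  totalAdmitted: number;
  rejected: number;
};

function summarize(s: StatsSnapshot) {
  const attempts = s.totalAdmitted + s.rejected;
  return {
    // 1 means protected capacity is full.
    utilization: s.inFlight / s.maxConcurrent,
    // Guard against division by zero when queueing is disabled.
    queueFill: s.maxQueue === 0 ? 0 : s.pending / s.maxQueue,
    // Share of attempted calls that were rejected.
    rejectRatio: attempts === 0 ? 0 : s.rejected / attempts,
  };
}

console.log(summarize({
  inFlight: 5, pending: 0, maxConcurrent: 10, maxQueue: 0,
  totalAdmitted: 90, rejected: 10,
}));
// { utilization: 0.5, queueFill: 0, rejectRatio: 0.1 }
```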

Lifecycle behavior

  • maxConcurrent must be a positive integer.
  • maxQueue must be a non-negative integer.
  • queueWaitTimeoutMs, when set, must be a finite number greater than or equal to zero.
  • Admission happens before the downstream fetch implementation is called.
  • Rejected calls do not call the downstream service.
  • maxQueue: 0 gives fail-fast behavior with no queueing.
  • maxQueue > 0 enables bounded FIFO waiting.
  • Caller aborts can cancel queued admission.
  • Caller aborts are forwarded to the underlying fetch call after admission.
  • Caller aborts after headers release capacity in default body-scoped mode.
  • Capacity is always released on fetch error.
  • In default releaseOn: 'body' mode, capacity is released when all created body branches are consumed, cancelled, or error out.
  • In releaseOn: 'headers' mode, capacity is released when fetch() resolves.
  • Duplicate body lifecycle paths do not double-release capacity.
  • releaseOn values are validated at runtime.
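
The option constraints in this list can be expressed as a small validator. The sketch below only illustrates the rules; how the package itself reports invalid options is not specified here, and `validateOptions`, `SketchOptions`, and the messages are hypothetical.

```typescript
// Hypothetical validation sketch for the option constraints listed above.
// `validateOptions` and its messages are illustrative, not the package's API.
type SketchOptions = {
  maxConcurrent: number;
  maxQueue?: number;
  queueWaitTimeoutMs?: number;
  releaseOn?: string; // typed loosely because the value is checked at runtime
};

function validateOptions(o: SketchOptions): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(o.maxConcurrent) || o.maxConcurrent < 1) {
    problems.push('maxConcurrent must be a positive integer');
  }
  if (o.maxQueue !== undefined && (!Number.isInteger(o.maxQueue) || o.maxQueue < 0)) {
    problems.push('maxQueue must be a non-negative integer');
  }
  if (
    o.queueWaitTimeoutMs !== undefined &&
    (!Number.isFinite(o.queueWaitTimeoutMs) || o.queueWaitTimeoutMs < 0)
  ) {
    problems.push('queueWaitTimeoutMs must be a finite number >= 0');
  }
  if (o.releaseOn !== undefined && o.releaseOn !== 'body' && o.releaseOn !== 'headers') {
    problems.push("releaseOn must be 'body' or 'headers'");
  }
  return problems;
}

console.log(validateOptions({ maxConcurrent: 10, maxQueue: 0 }).length);          // 0
console.log(validateOptions({ maxConcurrent: 0, releaseOn: 'trailers' }).length); // 2
```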

Deployment guidance

Bulkheads are local to the Node.js process that creates them. If an app runs four worker processes, containers, or pods, each process gets its own independent maxConcurrent and maxQueue capacity. For example, maxConcurrent: 10 on four pods can allow up to forty concurrent protected outbound calls across the deployment.
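
The arithmetic is simple enough to keep next to your deployment config. The helpers below are a sketch that assumes every process creates one bulkhead with identical limits; `deploymentCeiling` and `perProcessLimit` are hypothetical names.

```typescript
// Worst-case totals across a deployment, assuming every process creates
// its own bulkhead with the same limits (hypothetical helpers).
function deploymentCeiling(processes: number, maxConcurrent: number, maxQueue = 0) {
  return {
    maxInFlight: processes * maxConcurrent, // calls that may be in flight at once
    maxWaiting: processes * maxQueue,       // calls that may be queued at once
  };
}

// Working backwards: split a dependency-wide concurrency budget evenly,
// rounding down so the deployment total never exceeds the budget.
function perProcessLimit(totalBudget: number, processes: number): number {
  return Math.floor(totalBudget / processes);
}

console.log(deploymentCeiling(4, 10)); // { maxInFlight: 40, maxWaiting: 0 }
console.log(perProcessLimit(25, 4));   // 6
```

Remember to recompute the per-process value when the number of processes changes, for example under autoscaling.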

Pick initial values from the constrained dependency you are protecting, such as provider concurrency, DB-adjacent APIs, internal service capacity, or expensive fan-out. Start with maxQueue: 0 for fail-fast behavior when latency matters. Use a small queue only when brief bursts are normal and a bounded wait is better than immediate rejection.
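
When you have traffic numbers but no published limit from the dependency, Little's law gives one common starting estimate: average concurrency equals arrival rate multiplied by the average time each call holds capacity. The helper below is a sketch of that estimate, not guidance from the package; `estimateMaxConcurrent` is a hypothetical name.

```typescript
// Little's law: average concurrency = arrival rate x average time in system.
// Here "time in system" is how long one call holds capacity (hypothetical helper).
function estimateMaxConcurrent(callsPerSecond: number, meanHoldMs: number): number {
  return Math.ceil((callsPerSecond * meanHoldMs) / 1000);
}

console.log(estimateMaxConcurrent(50, 200)); // 10
console.log(estimateMaxConcurrent(30, 250)); // 8
```

In the default body-scoped mode the hold time includes reading the response body, so measure it end to end. The result is an average, so treat it as a floor for experimentation and cap it at whatever the dependency can actually sustain.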

Keep metric labels low-cardinality. Prefer labels like stripe, openai, search-api, or payments-service. Avoid raw URLs, user IDs, request IDs, tenant IDs, and query strings as metric labels.

Design notes

This package composes around async-bulkhead-ts. It does not replace higher-level concerns:

  • Retries — compose retry behavior outside the bulkhead.
  • Request timeouts — use AbortController or your existing HTTP timeout policy.
  • Rate limits — use provider-aware throttling or token buckets when you need rate semantics.
  • Distributed quotas — use a shared control plane or distributed limiter when you need cross-instance coordination.

Rule of thumb:

If you want to eventually send every HTTP call, use a queue.
If you want to protect your service and downstream dependency under overload, use a bulkhead.

License

Apache-2.0. See LICENSE.

Development

npm test
npm run verify