feat: add outbound proxy support to @arcjet/transport and @arcjet/guard#6089
feat: add outbound proxy support to @arcjet/transport and @arcjet/guard#6089davidmytton wants to merge 24 commits into
Conversation
There was a problem hiding this comment.
Arcjet Review — 🔴 High Risk
Decision: Reviewers Assigned
Rationale: This PR changes the core outbound transport behavior for both @arcjet/transport and @arcjet/guard, adds proxy environment-variable detection, changes supported Node.js engine ranges, adds runtime entry points, and introduces new package dependencies. The implementation includes good test coverage for many NO_PROXY and HTTP proxy cases and avoids logging proxy credentials, but proxy auto-detection is security-sensitive and there is a potential HTTP_PROXY/httpoxy-style risk that should be reviewed by humans. No specific escalation reviewers are configured, so the decision is Reviewers Assigned.
Summary of Changes
Adds outbound proxy support using HTTP_PROXY/HTTPS_PROXY with NO_PROXY handling across Node.js, Bun, Deno, and guard transports; adds proxy detection utilities and tests; adds a Deno transport entry point; updates docs, Node engine requirements, type configuration, and transport dependencies.
Escalation Triggers
- Dependency Changes: package.json files were changed: @arcjet/transport adds @arcjet/env and @arcjet/logger dependencies and tightens Node engine support; @arcjet/guard updates @types/node and engine constraints.
Security Flags
- [MEDIUM] Proxy environment handling (transport/detect-proxy.ts:116): Uppercase HTTP_PROXY is honored by default for HTTP base URLs. In CGI-like or legacy server environments, inbound Proxy headers can be translated into HTTP_PROXY, allowing attacker-controlled outbound proxying unless an httpoxy mitigation is applied, such as ignoring uppercase HTTP_PROXY when REQUEST_METHOD is present.
Review Focus Areas
- Should uppercase HTTP_PROXY be ignored when REQUEST_METHOD is present, or should the package explicitly document that CGI-style environments are unsupported?
This is the standard mitigation for the httpoxy class of vulnerabilities when libraries automatically honor HTTP_PROXY. - If an httpoxy mitigation is added to @arcjet/transport, can the duplicated proxy-resolution logic in @arcjet/guard be updated in lockstep?
The PR intentionally duplicates proxy logic, so security fixes must remain synchronized between packages. - Please verify Deno users can call createTransport() without --allow-env, or handle permission errors when reading process.env.
The Deno entry point calls detectProxy by default, and environment access in Deno can be permission-gated; this could turn proxy auto-detection into a runtime failure for otherwise valid clients. - Can CI or tests exercise actual HTTPS-over-proxy behavior, not just construction of an HTTPS proxy transport?
The HTTP proxy integration test validates HTTP/1.1 proxying, but HTTPS proxying depends on Node Agent CONNECT behavior and is the likely production path for the Arcjet API. - Is dropping Node >=20 and Node 23 support acceptable for this package version, and is this covered in release notes as a breaking runtime-support change?
The package previously advertised node >=20 and now requires >=22.21.0 <23 || >=24.5.0, which can break existing consumers even when they do not use proxy support. - Please confirm the new @arcjet/env and @arcjet/logger dependencies are reflected in the lockfile/workspace metadata and have compatible licenses and acceptable transitive dependency impact.
New runtime dependencies affect supply-chain risk and install reproducibility.
Notes
The diff appears to exceed the 1000-line automated-review threshold, so human review is recommended even aside from the security-sensitive proxy behavior. No hardcoded secrets were identified, and the proxy URL is deliberately not logged.
Path filtering: 2 files excluded by ignore paths. 20 of 22 files included in review.
Review: 7d63a58a | Model: openai/gpt-5.5 | Powered by Arcjet Review
There was a problem hiding this comment.
Arcjet Review — 🔴 High Risk
Decision: Reviewers Assigned
Rationale: This PR changes core outbound transport behavior for @arcjet/transport and @arcjet/guard, adds proxy auto-detection across runtimes, changes Node.js engine requirements broadly across packages, adds runtime entry points, and modifies CI coverage. The previously raised httpoxy concern appears addressed by ignoring uppercase HTTP_PROXY when REQUEST_METHOD is present, and the previous Deno permission concern appears addressed by catching environment-read failures and adding tests. No hardcoded secrets were identified, and proxy URLs/credentials are deliberately not logged. However, proxy routing is security-sensitive and the PR also triggers CI/CD and dependency-change escalation paths, so human review is required. No specific escalation reviewers are configured.
Summary of Changes
Adds outbound proxy support using HTTP_PROXY/HTTPS_PROXY/NO_PROXY for @arcjet/transport and @arcjet/guard, with Node using HTTP/1.1 agent proxying when needed and Bun/Deno relying on native fetch proxy support. Also adds Deno exports/runtime tests, expands proxy test coverage, updates README documentation, modifies CI matrices, and raises Node engine requirements across the workspace.
Escalation Triggers
- CI/CD Pipeline: Changes GitHub Actions workflows to use Node 22, remove Node 20 from the test matrix, and add Bun/Deno runtime proxy test jobs.
- Dependency Changes: Many package.json files change engine requirements; @arcjet/transport adds internal workspace dependencies on @arcjet/env and @arcjet/logger and adds new runtime exports/scripts.
Review Focus Areas
- Confirm the httpoxy mitigation intentionally ignores only uppercase HTTP_PROXY for HTTP targets when REQUEST_METHOD is present, and that this behavior matches the duplicated guard implementation.
Proxy environment handling can affect outbound request routing; the previous security finding is addressed here but should be verified by a human because this is a trust-boundary-sensitive path. - Verify the Node.js built-in Agent proxyEnv behavior works as expected on the declared supported engine ranges, including HTTPS CONNECT behavior and the HTTP/2-to-HTTP/1.1 fallback when a proxy is detected.
The transport protocol changes under proxy configuration, and incorrect runtime assumptions could silently bypass proxies or break production connectivity. - Check that the Bun/Deno runtime detection and fallback to createFetchTransport cannot double-log, bypass NO_PROXY expectations, or choose the wrong transport under conditional exports.
@arcjet/guard has duplicated proxy detection and different runtime entry behavior, so drift or runtime-condition mistakes could create inconsistent behavior between packages. - Verify the lockfile is updated consistently for package.json dependency/engine changes, or confirm no lockfile changes are required for these workspace-only dependencies.
npm ci can fail or install a different graph if package.json and the lockfile are out of sync. - Confirm public docs/release notes clearly call out the breaking Node.js support change from Node 20 to Node 22+ and the narrower proxy-supported ranges for @arcjet/transport and @arcjet/guard.
This PR changes runtime support across many packages; users need clear migration guidance before release.
Notes
The diff appears to exceed the 1000-line automated-review threshold, so this is potentially too large for reliable automated review. Previous review concerns about uppercase HTTP_PROXY/httpoxy and Deno environment permission failures appear to have been addressed in the current diff.
Path filtering: 2 files excluded by ignore paths. 58 of 60 files included in review.
Review: 20395000 | Model: openai/gpt-5.5 | Powered by Arcjet Review
There was a problem hiding this comment.
I did a couple passes at a review with Claude, and then added a bunch of tests to ensure that we had end-to-end coverage for all of Node, Bun, and Deno. As such, I don't think it makes sense for me to approve this, so I'm passing the buck to @qw-in for final signoff. The diff is bulky, but most of the line count is tests, so all we really need is for Quinn to scan through the tests and verify that the tests do what they say they do.
One important note: to keep support simple, and because it's time, one of the commits I added to this PR is removal of support for the now EOL Node 20. @arcjet/transport has more specific requirements for which versions of Node 22 it requires due to its proxy support, but anybody tracking LTS should be fine. The commit message should trigger this getting flagged by release-please in any case, but we should be clear about the new Node version requirements in the release notes for the SDK.
The bot should not be setting "changes requested" when all it's doing is escalating to human reviewers (without any providing any actionable feedback).
|
I pulled the node update out into #6090 as there are a few other items I was aware of that could be tweaked. If that looks good I can rebase this on top later. In the meantime I'll run through this diff |
Auto-detect the standard proxy environment variables (HTTP_PROXY and HTTPS_PROXY, respecting NO_PROXY) so requests to the Arcjet API can be routed through a proxy such as Squid. The proxy URL is never logged since it can contain credentials; only a single info-level line is logged at startup when a proxy is in use. How the request is proxied depends on the runtime, using each runtime's built-in proxy support: - Node.js routes through the proxy over HTTP/1.1 using the Node.js HTTP agent's built-in proxy support, otherwise connects directly over HTTP/2. - Bun and Deno let the runtime's native fetch perform the proxying. On @arcjet/guard, Bun resolves to the HTTP/2 transport but its Node HTTP agent ignores the proxy option, so it falls back to the fetch transport when a proxy is detected. - Edge runtimes (edge-light, workerd) don't support outbound proxy environment variables, so no proxy is used. The NO_PROXY parsing is intentionally duplicated between the two packages (@arcjet/guard keeps an edge-safe, dependency-free copy); the shared logic is kept identical and annotated to stay in sync. The agent's built-in proxy support requires Node.js >=22.21.0 <23 || >=24.5.0, so the engines fields and @types/node are bumped accordingly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fix the type error that broke the build when packages bundle the transport source: the agent `proxyEnv` option is typed as `ProcessEnv`, which some augmentations (e.g. Next.js) make require `NODE_ENV`, so a bare object literal was rejected. Convert the single resolved proxy variable to `ProcessEnv` instead of pulling in the whole environment. Drop Node.js 20 from the test and examples workflows since the proxy support requires Node.js >=22.21.0; bump the remaining Node 20 setup steps to 22. Address review feedback: - httpoxy: ignore uppercase `HTTP_PROXY` for HTTP targets when `REQUEST_METHOD` is set (CGI), so an inbound `Proxy` header can't control outbound proxying. Applied identically to both detect-proxy copies. - Deno: reading proxy environment variables threw on runtimes that gate environment access (Deno without `--allow-env`); catch that and treat it as no proxy rather than failing transport creation. - Tests: assert the exact logged message rather than substrings, and build the test proxy's forwarded URL from the trusted origin. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`@arcjet/guard` had no `deno` export condition, so Deno resolved the `.` import via the `node` condition to the Node entry point. That builds a Node HTTP agent with the `proxyEnv` option, which Deno's Node compatibility layer does not implement (just like Bun), so a configured `HTTP_PROXY`/ `HTTPS_PROXY` was silently bypassed — the exact failure `@arcjet/transport`'s `deno` entry point was added to avoid. Add a `deno` export condition pointing at the fetch entry point (placed before `node` so Deno, which has both conditions active, matches it first), so Deno uses the fetch transport whose native `fetch` honors the proxy environment variables. As defense-in-depth for an explicit `@arcjet/guard/node` import on Deno, also fall back to the fetch transport when `isDeno()` is detected, mirroring the existing `isBun()` handling. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The proxy tests only exercised plaintext-HTTP absolute-form forwarding, but the production Arcjet API is HTTPS, which the Node agent reaches over an HTTP/1.1 `CONNECT` tunnel — a different code path that had no end-to-end coverage (the lone HTTPS test only asserted the transport's shape). This brings `transport/index.ts` to full branch coverage. Add test helpers: - `createConnectProxy` — a tunneling proxy that handles the `CONNECT` method, asserts the requested authority, and pipes bytes through. Its tunnel sockets are tracked so `close()` can destroy them; a `CONNECT` socket is detached from the server's connection tracking, so a keep-alive agent would otherwise hold the server open forever. - `generateSelfSignedCert` — a throwaway self-signed cert (via `openssl`, as the guard tests already do) for a real HTTPS origin. - `listen` now takes an optional protocol so it can return an `https` URL. The new test stands up an HTTPS origin and tunneling proxy and asserts the request is routed through the proxy via `CONNECT`. The agent doesn't expose a `ca` option, so the self-signed origin is trusted by disabling TLS verification for that test only; `NODE_TLS_REJECT_UNAUTHORIZED` is declared in `turbo.json` accordingly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On Bun and Deno the `bun.js`/`deno.js` entry points delegate proxying to the runtime's native `fetch`; the Node suite imports those entry points under Node, so it never verified that the runtimes actually honor the proxy environment variables — the part the PR notes is hard to test. Add runtime tests that run on real Bun and Deno: a shared fixture stands up an HTTPS Eliza origin reachable only through a `CONNECT` proxy, points `HTTPS_PROXY` at it (the production API is HTTPS, so this is the `CONNECT` path), builds the transport from the runtime entry point, and asserts the request was tunneled through the proxy. The fixture reuses the `createConnectProxy`/`generateSelfSignedCert` helpers added for the Node HTTPS test. - `test-runtime-bun` / `test-runtime-deno` npm scripts run them. - A `transport-runtime` CI matrix runs them on Bun (1.3.0 and latest) and Deno (lts and latest). Bun is pinned to 1.3.0+ because that is where its native-fetch proxy support was verified (the shared bun-test job's 1.2.19 predates it). - `tsconfig.json` excludes `test/runtime/**` from the Node build, since those files use runtime-specific globals (`bun:test`, `Deno`). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The proxy detection reads both the uppercase and lowercase forms (`http_proxy`/`https_proxy`/`no_proxy`), but only the uppercase variants were declared in `globalEnv`. Declaring the lowercase forms keeps Turbo's cache key correct if task caching is ever enabled and stops them being dropped under strict env modes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nreadable `detectProxy` reads `ARCJET_LOG_LEVEL` to build the default logger after a proxy is resolved. That read sat outside the try/catch that guards proxy detection, so on a permission-gated runtime (e.g. Deno without `--allow-env`) where a proxy came from an explicit `proxyEnv` but the ambient environment is unreadable, it would throw and fail transport creation — contrary to the function's "treat env errors as no proxy" intent. Guard the default-logger construction and skip the startup line instead; the resolved proxy is still returned. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`detectProxy` parsed `new URL(baseUrl)` inside the try/catch that guards environment access, so an invalid `baseUrl` was silently turned into "no proxy" rather than throwing. `@arcjet/transport` parses the URL up front and relies on it throwing (its "should throw w/o url" test). Parse the URL before the try in the guard copy too, so the two behave consistently and a malformed URL fails fast; only environment access stays recoverable. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`isNoProxy` matches entries as host names; IP/CIDR ranges (e.g. `10.0.0.0/8`) are not handled. Since the Arcjet API is addressed by a host name this never applies in practice, and it matches curl, which also doesn't support CIDR in `NO_PROXY`. Rather than add IP-range matching to a security-sensitive parser that is duplicated across two packages, document the supported syntax and the limitation in both READMEs and both parser copies, and note that on Bun and Deno the runtime's own `fetch` applies NO_PROXY. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The guard copy logged "Connecting to the Arcjet API through a proxy" unconditionally via `console.info`, so it always printed to stdout with no way to silence it — diverging from `@arcjet/transport`, which logs the same line through `@arcjet/logger` at `info` level (hidden by default, since the default level is `warn`). Gate the line on `ARCJET_LOG_LEVEL` (`info` or `debug`) so the two packages behave the same and the line can be silenced. The level is read from the same environment the proxy was resolved from, keeping this copy edge-safe with no imports. Note: the line is now hidden by default; set `ARCJET_LOG_LEVEL=info` to see it, as with `@arcjet/transport`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The transport `createTransport` tests called the factory without clearing the proxy environment, so a developer or CI runner with `HTTPS_PROXY` set would silently push the no-proxy cases onto the proxy branch (and could leak a stray startup log) — host-dependent, non-deterministic tests. Clear the standard proxy variables in `beforeEach` and restore them in `afterEach` so these cases always exercise the intended path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The transport tests only exercised the `https.Agent` branch, leaving the `http.Agent` branch (HTTP target) uncovered. Add a case that detects `HTTP_PROXY` for an `http` target, bringing `transport-node.ts` to full line and branch coverage. Also simplifies the existing HTTPS case now that proxy-environment isolation is handled by the suite's `beforeEach`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The proxy-resolution logic is duplicated between `@arcjet/transport` and `@arcjet/guard` (the guard copy stays edge-safe with no imports), kept in sync only by a comment. Add a test that compares the shared helpers (`proxyForUrl`, `isNoProxy`, `firstValue`) across both source files — ignoring comments and formatting — so a logical change to one copy that isn't mirrored in the other fails CI. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The agent's `proxyEnv` option is typed as `ProcessEnv`, whose index signature accepts any string key — so a misspelled proxy variable name (e.g. `HTTPSPROXY`) would compile and silently disable proxying. In `@arcjet/transport` this was compounded by an `as unknown as ProcessEnv` cast that erased all checking. Build the literal through an explicit `Partial<Record<"HTTP_PROXY" | "HTTPS_PROXY", string>>` type in both packages so a misspelled key is a compile error. `@arcjet/transport` still asserts to `ProcessEnv` afterwards because some augmentations (e.g. Next.js) make it require `NODE_ENV`; `@arcjet/guard` needs no cast. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ling TLS The HTTPS proxy tests set `NODE_TLS_REJECT_UNAUTHORIZED=0` to trust a self-signed origin, which CodeQL (rightly) flags as disabling certificate validation. Drop it: the agent (and native `fetch`) sends a `CONNECT` to the proxy for an HTTPS target *before* the TLS handshake, so routing is verified by asserting the proxy received the `CONNECT`. The handshake over the tunnel is then expected to fail on the untrusted cert, which the test swallows. Applies to the Node HTTPS test and the Bun/Deno runtime tests; removes the Deno `--unsafely-ignore-certificate-errors` flag and the now-unused `NODE_TLS_REJECT_UNAUTHORIZED` from `turbo.json`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two failures in the new `transport-runtime` matrix: - Bun 1.3.0 doesn't honor the proxy environment variables in `fetch` (the request went direct), so the proxy assertion failed. Bump the pinned version to 1.3.14, where it was verified to work. - `setup-deno` resolving `lts`/`latest` fetches from `dl.deno.land`, which the egress allowlist blocked. Add `dl.deno.land:443`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The test CONNECT proxy used `http.createServer().on("connect")`, but some
runtimes' Node compatibility layers don't emit the `connect` event — Deno
LTS (2.5.7) never fired it, so the proxy saw no CONNECT and the Deno lts
runtime job failed (`connectCount` was 0). Reimplement the proxy with a raw
`node:net` server that parses the `CONNECT` line itself, which works on
Node, Bun, and Deno (2.5.7 and latest). Since `net.Server` has no
`closeAllConnections()`, track accepted sockets and destroy them on close.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The runtime fixture set `HTTPS_PROXY` after the process had already started. That doesn't match production — a proxy is configured via a plain environment variable set before launch — and Bun (and older Deno) only read the proxy env at startup, so the runtime-set value was ignored, making the test appear to require Bun >= 1.3.14. In fact Bun honors a startup-set `HTTPS_PROXY` for `fetch` back to at least 1.2.0 (verified end-to-end). Have the `test-runtime-*` npm scripts export `HTTPS_PROXY` before launching the runtime (the proxy binds that fixed port; the origin uses a random one), so the test exercises the production path. Lower the Bun matrix floor from 1.3.14 to 1.3.0, the project's minimum supported Bun. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`createTransport` parsed `baseUrl` with `new URL` and then `detectProxy` parsed it again (and `@arcjet/guard`'s node entry parsed it a third time for the agent's protocol check). Change `detectProxy` to accept an already-parsed `URL` so each entry point parses once and reuses it. The invalid-URL throw now happens where the caller constructs the `URL` (still up front, so it surfaces rather than being swallowed). Purely a one-time micro-optimization at client construction; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Apply /simplify cleanups, favoring explicit code over special cases: - Flatten `@arcjet/guard`'s node `createTransport` into three clearly labelled branches (no proxy -> HTTP/2; proxy on Bun/Deno -> fetch; proxy on Node -> agent) instead of a Bun/Deno special case buried inside the proxy block, and compute `url.protocol === "https:"` once in both it and `@arcjet/transport`'s `index.ts`. - Break the dense `isNoProxy` loop into named `parseNoProxyEntry` and `hostMatches` steps in both detect-proxy copies (kept in sync; behavior unchanged, verified by the NO_PROXY table and the drift test). Consolidate tests without losing coverage: - Drive the four near-identical "works over HTTP on <runtime>" transport tests from a table over a shared `withHttpOrigin` helper. - Extract the duplicated proxy-env isolation and the Bun/Deno runtime-simulation boilerplate into `arcjet-guard/test/_shared/proxy-env.ts`, and loop the two fetch-fallback cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
79a4fc2 to
3a374f2
Compare
…22.x #6090 standardized the monorepo on `@types/node` 22.x (to type-check against the minimum supported Node). The Node proxy path passes `proxyEnv` to the HTTP agent, an option only declared in `@types/node` 24.x — and transport's source is re-type-checked by every package that bundles it (e.g. @arcjet/sveltekit, @arcjet/next), so pinning the proxy packages to 24.x both diverges from the standard and breaks those consumers' builds under 22.x. Instead, keep `@types/node` at 22.x everywhere and add `proxyEnv` through an intersection type (`AgentOptions & { proxyEnv: ... }`) so it type-checks on both lines. Misspelled keys are still caught via the precise key type. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
3a374f2 to
2e91b25
Compare
|
@qw-in this has now been rebased and had its version of the Node 20 drop dropped. |
|
Thanks @arcjet-rei! Still going through right now |
There was a problem hiding this comment.
I think my main concern with this approach is we are being automagic ✨. I'd prefer to make this an explicitly configurable option. Partially thats how I prefer to design apis but I also worry about breaking existing customers who may have HTTP_PROXY environment variables.
The current approach also doesn't (to my eye) allow opting arcjet out of the environment's proxy config without manually updating the environment variables.
So, Id propose we put it on the SDK initialization:
// requests
const aj = arcjet({
// Place in transport to differentiate from "proxy"
// top level configuration object
transport: {
// ...
proxy: /* ... */
}
// ...
});
// guard
const arcjet = launchArcjet({
transport: {
// ...
proxy: /* ... */
}
});This has the side benefit of giving us somewhere to explain the http proxy in tsdoc.
Note
We will want to be careful not to confuse users with "proxy" as we already have a a few difference references to "proxy"
My other concern is right now configuring a proxy will degrade the underlying connection from http2 to http1.1 which (to my understanding) has real latency implications. I think we should model the configuration to make this clear. Maybe something like this? (Just spitballing)
type ArcjetTransportConfiguration =
| {
protocol: "http2";
/** Not supported over HTTP/2 — see {@link …} */
proxy?: never;
}
| {
protocol: "http1.1";
proxy?: HttpProxyConfig;
};We can still have automatic proxy support but I'd suggest making it opt-in:
// requests
import { arcjet } from "@arcjet/node";
// export on the base package or could be transport directly
// unrelated conversation but I'd like us to move to this
// pattern instead of multiple packages one day
import { transportFromEnvironment } from "arcjet/transport"
const aj = arcjet({
transport: transportFromEnvironment(),
// or maybe
transport: {
protocol: "http1.1",
proxy: proxyFromEnvironment(process.env),
}
});
// guards
import { arcjet } from "@arcjet/guard";
import { transportFromEnvironment } from "@arcjet/guard/transport"
const arcjet = launchArcjet({
transport: transportFromEnvironment(),
});I think this means we can scope down where we are touching the environment variables (which lots of bundlers/runtimes seem to trip over in my experience)
The reason I didn't raise an eyebrow at the transparent / environment variable approach to proxying is that that's basically the way I've always done it, all the way back to Perl's CGI.pm thousands of years ago. It's a very common approach in pretty much every ecosystem except Node, which chooses to be more explicit about it (FWIW, we built transparent / automatic proxy support into node-newrelic because that was the convention we followed at the time). But I think Node as a whole sides with your philosophy. I'm fine with making this explicit and opt-in, or at least making this a judgment call for @davidmytton. Removing the environment variable code would certainly simplify the PR.
We don't necessarily have to pay that tax for Node (the story is muddier for Bun and Deno) -- because Squid doesn't mess with the traffic, if we do an HTTP/2 CONNECT, that should work. There are some nuances (if the underlying socket isn't created correctly by the HTTP/2 client and/or Squid, there's packet overhead on sequential requests over the HTTP/2 socket that amounts to a ~40ms/req tax. We should make HTTP2 configurable, use sensible defaults to avoid incurring the Nagle tax unnecessarily, and make sure that we have clear documentation about how egress proxying will need to be configured to keep request overhead low. I'll add a commit with HTTP/2 support for Node. I've gone ahead and added a commit to the PR that adds HTTP/2 support for Node proxying. Both Bun and Deno treat HTTP/2 via their fetch implementations as either immature or experimental, and the Bun implementation has at least one significant outstanding bug. Deno and Bun users are outta luck for now, unless they want to use a sidecar for egress that transparently proxies Arcjet-bound traffic instead of configuring proxy support in-app (which is probably what most Fargate or EKS / GKE users are doing without even knowing it). |
Proxying on Node.js previously always downgraded HTTP/2 to HTTP/1.1, because Node's built-in agent proxy support only works over HTTP/1.1. For a latency-sensitive API that gives up HTTP/2 multiplexing, so a burst of concurrent requests opens a new proxy connection each instead of sharing one. Add a `proxyHttpVersion` transport option. The default (`"1.1"`) keeps the existing agent-based behavior. Setting `"2"` opens an HTTP `CONNECT` tunnel and performs the TLS handshake — and the ALPN negotiation that selects `h2` — directly with the origin, so the proxy only blindly forwards the tunnel and cannot downgrade the protocol. This slots into connect-node's stock Http2SessionManager via `nodeOptions.createConnection` with no fork, so pings and the idle timeout keep working. The tunnel helper disables Nagle's algorithm on its socket; without that, the interaction with delayed ACKs adds ~40ms per round trip. Documented alongside the caveats that this is Node-only, requires a tunneling (non-TLS-terminating) proxy, and that the proxy itself must not buffer the tunnel. Tests cover a full h2c round trip through a CONNECT proxy and an end-to-end ALPN=h2 negotiation over TLS through the tunnel. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
adaeb63 to
218aa13
Compare
|
It's standard to use HTTP_PROXY, HTTPS_PROXY, and NO_PROXY vars and Node supports this as documented with node:http and node:https (v22.21.0 or v24.5.0+) methods as well as fetch() (v22.21.0 or v24.0.0+). We should follow the same approach. |
…e sniffing The Node transport detected Bun and Deno at runtime (`"Bun" in globalThis` / `"Deno" in globalThis`) to fall back to the fetch transport, because those runtimes' `node:http` agents ignore the `proxyEnv` option. Reviewers asked to avoid switching on the runtime in code and to make an explicit `@arcjet/guard/node` import behave like Node rather than silently using fetch. Encode each runtime's strategy in its own entry point, selected statically by the package `exports` conditions instead of `globalThis` checks: - New `bun` entry (`transport-bun.ts`): HTTP/2 directly (Bun's `fetch` has no HTTP/2) and the fetch transport when a proxy is detected (Bun's native `fetch` proxies). The `bun` export condition now points here. - `transport-node.ts` is pure Node again: HTTP/2 directly, or the HTTP agent with `proxyEnv` when a proxy is detected. No runtime checks. - Shared the direct HTTP/2 builder in `transport-http2.ts` so Node and Bun don't duplicate it. An explicit `@arcjet/guard/node` import now always uses the Node transport, as a user importing `/node` would expect. The only branch left in each entry is on proxy presence, which the code already computed. Removes the now-dead `withSimulatedRuntime` test helper. Also syncs the guard lockfile to the `@types/node` 22.x already pinned in package.json (it was stale at 24.x). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow-up cleanups to the HTTP/2-through-proxy work, no behavior change: - Factor the duplicated Http2SessionManager + pre-connect + createConnectTransport block (the direct path and the new proxyHttpVersion: "2" path) into a single createHttp2Transport() helper that takes an optional createConnection. Removes the second copy of the AWS idle-timeout comment and keeps the two paths from drifting. - proxy-tunnel.ts: pass the stream write callback straight through instead of wrapping it in `() => callback()`, and release the buffered CONNECT response head after the tunnel splices so it isn't retained for the connection's life. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds outbound proxy support to
@arcjet/transportand@arcjet/guard. The standard proxy environment variables (HTTP_PROXYandHTTPS_PROXY, respectingNO_PROXY) are auto-detected so requests to the Arcjet API can be routed through a proxy such as Squid.How proxying works per runtime
Each runtime uses its own built-in proxy support:
fetchperforms the proxying. On@arcjet/guard, Bun resolves to the HTTP/2 transport, but its Node HTTP agent ignores the proxy option, so it falls back to the fetch transport when a proxy is detected (verified on Bun 1.3.14).edge-light,workerd) — don't support outbound proxy environment variables, so no proxy is used.Notes
NO_PROXYparsing is intentionally duplicated between the two packages (@arcjet/guardkeeps an edge-safe, dependency-free copy). The shared logic is kept logically identical and annotated with a "keep in sync" comment in both files.>=22.21.0 <23 || >=24.5.0, so theenginesfields and@types/nodeare bumped accordingly.infolevel. In@arcjet/transport(via@arcjet/logger, default levelwarn) it requiresARCJET_LOG_LEVEL=infoto be visible; in@arcjet/guard(viaconsole.info) it always shows.Closes #6082