feat: per-request HTTP middleware via WebSocket #37
Merged
Add support for an external middleware service that can inspect, block, or rewrite HTTP requests and responses in real time over a persistent WebSocket connection with JSON protocol.

New middleware package (internal/greyproxy/middleware/):
- types.go: wire message types (Hello, HookSpec, Decision, etc.)
- client.go: WebSocket client with multiplexing, reconnect, hello exchange
- filter.go: filter evaluation (glob, regex, exact match) with compiled cache

New hook points (internal/gostx/):
- proxy_hook.go: GlobalProxyRequestHook, GlobalProxyResponseHook (plain HTTP)
- mitm_hook.go: GlobalMitmRequestMiddlewareHook (Step 1.5), GlobalMitmResponseHook (Step 4a) for the MITM pipeline

Call sites:
- handler.go: request hook before RoundTrip, response hook after
- sniffer.go: request hook after hold hook / before credential substituter, response hook before writing to client / before round-trip hook

Wire-up:
- main.go: --middleware CLI flag with http->ws URL normalization
- program.go: config loading, client startup, all 4 hook registrations
- config.go: MiddlewareConfig struct

Includes docs/middleware.md and 6 Python example middleware (uv run): passthrough, command stripper, PII redactor, secret scanner, cost tracker, audit log.
The h2Handler.ServeHTTP (HTTP/2 code path) was missing middleware request and response hooks. When clients connect via SOCKS5 to HTTPS endpoints, the decrypted stream uses HTTP/2 and routes through h2Handler, which only had the observability hook but not the middleware hooks (GlobalMitmRequestMiddlewareHook, GlobalMitmResponseHook). This meant WebSocket middleware never received http-response messages for HTTP/2 traffic.
When the middleware rewrites a response body (e.g., stripping dangerous commands), the new body is uncompressed plaintext. If the original response had Content-Encoding (zstd, gzip, etc.), the header was preserved, causing the client to fail decompression on the uncompressed body. Now Content-Encoding and Transfer-Encoding are removed on rewrite.
The proxy now decompresses gzip/deflate/zstd response bodies before sending them to the middleware WebSocket. Without this, compressed responses (common with HTTP/2 clients like Node.js) were sent as raw bytes that the middleware couldn't inspect, so pattern matching (e.g., dangerous command stripping) silently failed on compressed content.
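A minimal Python sketch of the two behaviours described in the last two commits: decode the body before a middleware inspects it, and drop the encoding headers once the body has been replaced with plaintext. The function names are hypothetical and this is not greyproxy's Go code; zstd is left out because it needs a third-party library on most Python versions.

```python
import gzip
import zlib


def decompress_body(body: bytes, content_encoding: str) -> bytes:
    """Return plaintext for encodings the standard library can decode."""
    enc = content_encoding.strip().lower()
    if enc == "gzip":
        return gzip.decompress(body)
    if enc == "deflate":
        # HTTP "deflate" is normally zlib-wrapped, but some servers send raw deflate.
        try:
            return zlib.decompress(body)
        except zlib.error:
            return zlib.decompress(body, -zlib.MAX_WBITS)
    # Identity or an encoding this sketch does not handle (e.g. zstd): pass through.
    return body


def strip_encoding_headers(headers: dict) -> dict:
    """Drop headers that no longer describe a rewritten plaintext body."""
    stale = {"content-encoding", "transfer-encoding"}
    return {k: v for k, v in headers.items() if k.lower() not in stale}
```

Without the second step, a client would try to decompress a body that is already plaintext, which is the failure described above.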
Lets a middleware subscribe to LLM traffic with {"llm": true} in its
hello filter instead of duplicating greyproxy's host/path→decoder map.
The proxy evaluates EndpointRegistry.Match() on every hook invocation
(no caching, so a rule toggled in the UI takes effect on the next
request) and passes isLLM into MatchesFilter. Disabled registry rules
naturally do not match because Match() returns "" for them.
nil = no LLM gating
true = only requests the endpoint registry resolves to a decoder
false = only non-LLM traffic
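The three filter states can be sketched as a small predicate. This is an illustrative Python version with a hypothetical function name, where `None` stands in for Go's nil; the real check lives in MatchesFilter.

```python
from typing import Optional


def llm_gate_matches(filter_llm: Optional[bool], is_llm: bool) -> bool:
    """Tri-state gate: None means no LLM gating, True/False must equal is_llm."""
    if filter_llm is None:
        return True
    return filter_llm == is_llm
```

A middleware that sends `{"llm": true}` in its hello filter would then only see requests for which the registry match resolved to a decoder.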
--middleware is now repeatable and the YAML key becomes `middlewares:` (list). Multiple middlewares run sequentially: each sees the previous one's (possibly rewritten) output as its input, and deny/block short-circuits the chain.

Per hook type the wiring builds an ordered list of (client, filters) pairs, and each global hook iterates it:
- allow/passthrough continues the cascade
- rewrite mutates working state so the next step sees the new version
- deny/block stops the chain and returns immediately

Request-side cascades mutate req in place (plain HTTP and MITM). Response-side cascades track a working (status, headers, body) tuple and flush it back via the returned decision, since the MITM response hook receives its info struct by value and the plain HTTP response path applies rewrites through the decision struct.

Decision gains a Tags map[string]any field (structlog-style) so a middleware can emit per-request metadata on any action. Tags are preserved per middleware with no cross-middleware merging, which matters for the upcoming Activity integration.
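The cascade rules above can be sketched in a few lines of Python. The `Decision` shape and `run_request_cascade` name here are simplified stand-ins chosen for illustration (only a body is tracked, and handlers are plain callables rather than middleware clients), so treat it as a model of the iteration logic rather than the code in program.go.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Decision:
    action: str                      # "allow", "rewrite", or "deny"
    body: Optional[bytes] = None     # replacement body when action == "rewrite"
    tags: dict = field(default_factory=dict)


def run_request_cascade(
    body: bytes,
    middlewares: List[Tuple[str, Callable[[bytes], Decision]]],
):
    """Run middlewares in order; return (final_action, final_body, per-middleware tags)."""
    tags_by_middleware = []
    for name, handler in middlewares:
        decision = handler(body)
        # Tags are kept per middleware, never merged across middlewares.
        tags_by_middleware.append((name, decision.tags))
        if decision.action == "deny":
            return "deny", body, tags_by_middleware   # short-circuit the chain
        if decision.action == "rewrite" and decision.body is not None:
            body = decision.body                      # the next step sees the new version
    return "allow", body, tags_by_middleware
```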
Non-trivial middleware decisions (deny, block, rewrite, or silent allow/passthrough carrying structured tags) from the MITM response cascade now surface in the Activity view. Scope is intentionally limited to MITM response for this first pass; request-side denies and the plain-HTTP path can follow.

Correlation is the key challenge: the middleware response hook and the round-trip persistence hook both fire for the same request but are separate callbacks with no shared key. A short request id is now generated once at the top of the sniffer's httpRoundTrip() and threaded through HTTPRoundTripInfo → MitmRoundTripInfo so both hooks see the same id. The middleware cascade stashes row-worthy events in a per-process map keyed by request id; the persistence hook drains and writes them to middleware_events with the freshly-created transaction id. A TTL sweeper reaps orphan buckets if a request never reaches the persistence hook.

Schema (migration 13): middleware_events with composite (transaction_kind, transaction_id) index for the cheap join. Only mutating actions or tag-emitting decisions produce a row, so silent middlewares stay invisible and don't bloat the table.

Activity rendering: QueryActivity does one extra query per page to load events for the fetched rows and attach them to ActivityItem; the activity table shows per-event badges next to the URL and lists them grouped by middleware in the detail panel with diff summaries, durations, and raw tags.
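The stash/drain/sweep correlation described above could look roughly like this in Python. The class name, the 60-second TTL, and the injectable clock are assumptions made for the sketch; the commit does not state the real TTL or the Go types involved.

```python
import threading
import time


class PendingEvents:
    """Per-process stash of middleware events keyed by request id."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds          # assumed value, not taken from the PR
        self._clock = clock
        self._lock = threading.Lock()
        self._buckets = {}               # request_id -> (created_at, [events])

    def stash(self, request_id: str, event: dict) -> None:
        """Called by the middleware cascade for row-worthy decisions."""
        with self._lock:
            _, events = self._buckets.setdefault(request_id, (self._clock(), []))
            events.append(event)

    def drain(self, request_id: str) -> list:
        """Called by the persistence hook once the transaction id is known."""
        with self._lock:
            _, events = self._buckets.pop(request_id, (None, []))
            return events

    def sweep(self) -> int:
        """Reap buckets whose request never reached the persistence hook."""
        with self._lock:
            now = self._clock()
            stale = [rid for rid, (created, _) in self._buckets.items()
                     if now - created > self._ttl]
            for rid in stale:
                del self._buckets[rid]
            return len(stale)
```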
Plain-HTTP traffic (non-TLS upstreams, local dev servers, HTTP-only LLM endpoints) was invisible in Activity: only the TCP connection rows were logged and nothing ever populated http_transactions for this path. The middleware cascade already ran on these requests and could deny or rewrite them, but since no transaction row existed, the new Phase C Activity integration had nothing to attach middleware events to.

This adds the symmetric piece:
- New ProxyRoundTripInfo + GlobalProxyRoundTripHook in gostx, fired after the response has been written back to the client. Body capture is gated on the hook being set, so the default path has no new overhead when nobody consumes it.
- The plain-HTTP handler generates a RequestID once at the top of proxyRoundTrip() and stores it in ctx via gostx.WithRequestID. Both the middleware cascade hooks and the round-trip hook read it from ctx to correlate decisions with the transaction row.
- The plain-HTTP request and response middleware cascades in program.go now stash middleware_events (deny/block/rewrite and tagged allow/passthrough) under the RequestID, matching the MITM response cascade behavior.
- program.go installs GlobalProxyRoundTripHook unconditionally. It calls CreateHttpTransaction with the captured request/response data, drains any stashed middleware events and writes them, then publishes EventTransactionNew so the Activity live feed updates.

Net effect: a plain HTTP request through greyproxy now produces an http_transactions row, shows up in Activity with method/url/status/duration, and surfaces middleware event badges identically to a MITM-intercepted HTTPS request.
A middleware can now return an optional "name" field in its hello
response. The proxy stores it alongside the URL and displays it in
the Activity view instead of the raw ws:// URL, which was unreadable
once more than one middleware was in the cascade.
- middleware.HelloMsg gains Name; the Client stores it and exposes
Name() alongside HookSpecs() / MaxBodyBytes().
- program.go captures it into clientHook and threads it through every
stash site (plain HTTP request/response + MITM response cascades).
- Migration 14 adds middleware_events.middleware_name as a nullable
column. WriteMiddlewareEvent writes it; LoadMiddlewareEventsForActivity
reads it.
- MiddlewareEventSummary.DisplayLabel() prefers the name and falls back
to the URL, so middlewares that didn't upgrade still render sensibly.
- Activity UI shows the friendly label in the row badge ("rtk-compress:
rewrite") and in the detail panel, keeping the URL as a tooltip and
as parenthetical text next to the name.
- rtk-compress example declares name: "rtk-compress".
- docs/middleware.md describes the field as optional-but-recommended.
Follows up the middleware-name work. The activity row badge now shows
the friendly name declared in the middleware's hello (or the URL as
fallback), so a cascade with multiple middlewares reads as
"✎ rtk-compress" rather than "rewrite". The action type is encoded
redundantly via both the background color and a small unicode glyph:
✗ red -- deny / block
✎ amber -- rewrite
♯ blue -- tagged-allow / tagged-passthrough
The action text moves into the tooltip ("rewrite by rtk-compress
(ws://localhost:9000/middleware)") where it is still discoverable but
does not crowd the row.
User feedback: the ws:// URL is noise in the UI. The badge tooltip no longer includes it, and the detail panel no longer appends it in parentheses after the name. The middleware URL is still stored in middleware_events.middleware_url for provenance (API / debugging), but the Activity view shows only the friendly name. Events whose middleware did not declare a name still fall back to the URL as the display label (via MiddlewareEventSummary.DisplayLabel), so anonymous middlewares remain identifiable.
The destination td had a blanket `truncate` class: overflow hidden, single line, ellipsis. Fine when the td contained only the URL, but the middleware event badges appended after the URL were getting clipped by overflow:hidden whenever the URL pushed the row width to the column edge. Restructure the HTTP branch as a flex container: credentials icon (shrink-0) + URL (truncate, min-w-0) + middleware badges (shrink-0). The URL shrinks with ellipsis, the badges stay visible on the right. Connection branch gets its own inner truncate div so the existing host text keeps its truncation behavior unchanged.
Two related gaps surfaced once the rtk-compress middleware was run against real Claude Code HTTPS traffic to api.anthropic.com:

1. The MITM request middleware hook rewrote bodies successfully but never stashed middleware_events rows, so no badge ever showed in Activity for HTTPS traffic. Plain HTTP (request + response) and MITM response cascades stashed; only MITM request didn't.
2. The MITM path only carried the RequestID inside HTTPRoundTripInfo, not via ctx. The MITM request hook receives `(ctx, req, container)` with no info struct, so it had no way to read the id that Phase C assumes is the correlation key.

Both are fixed by:
- Moving NewRequestID / WithRequestID / RequestIDFromContext down into the sniffing package so both the sniffer (generator) and gostx (consumer) share the same unexported ctx key. gostx/proxy_hook.go becomes a thin wrapper that delegates to sniffing. The handler/http and cmd/greyproxy call sites keep using gostx.* unchanged.
- Sniffer's httpRoundTrip() now also writes the id to ctx immediately after generating it, so every downstream hook (hold, middleware request, middleware response, round-trip persist) sees the same id via RequestIDFromContext.
- MITM request cascade in program.go now stashes events for deny / rewrite / tagged-allow / tagged-passthrough under the RequestID, matching the three other cascades. The MITM response persist hook already drains pending events after CreateHttpTransaction, so these new events land on the same transaction row.
- The rtk-compress example middleware was also updated as part of the same investigation: LOG_CMD narrowed back to commands that produce severity-tagged output (tail/journalctl/dmesg/less/more/*.log), pick_mode returns None again for unknown shapes, and rtk_compress special-cases `rtk log` to omit the `-` arg (which rtk treats as a literal filename for the log subcommand, unlike json/diff).
Documents the rtk tool-output compressor example and ships a reproducible test setup (fake Anthropic server + client) that measures the before/after byte delta to prove the rewrite path is wired up end-to-end.
main.go had a struct-field misalignment that failed gofmt; the feedback file was committed to this branch by mistake (it concerns an unrelated Greywall project).
…aders

Addresses a cluster of correctness, security, and simplicity issues found during PR review. Each one individually was small; together they change the semantics operators should rely on, so the doc updates are part of the same commit.

client.go
- Hello type validation returned a nil err on mismatch (`return err` after a successful ReadJSON), so a server replying with a wrong type silently succeeded. Now returns a real error and the connection is dropped.
- The read loop previously killed the entire connection on any JSON unmarshal error, which drained every in-flight request to a default decision. Now malformed frames are logged and skipped; only transport errors trigger reconnect.
- Send() held the client-wide mutex across the WebSocket write, so a slow peer stalled reads of pending/hooks. Writes now use a dedicated writeMu.
- Pending entries track whether they were a request or response, so drainPending() returns the correct default action (block/passthrough for response, deny/allow for request) instead of always emitting a request-shaped deny.
- Decision gains a Fallback field (json:"-") carrying the reason when the Decision was synthesised locally. Cascades log this at warn so operators can distinguish "middleware allowed" from "middleware was down".

Fail-closed default
- OnDisconnect now defaults to "deny" rather than "allow". A policy middleware (secret scanner, PII redactor) that crashes or is unreachable should not let traffic flow through silently. Advisory-only middleware (audit, cost tracker) must set on_disconnect: allow explicitly, which the docs now frame as a deliberate opt-in.

Rewrite header denylist
- A middleware's `rewrite` decision previously merged into req.Header and resp.Header with no filter, so a compromised middleware could overwrite Authorization, Cookie, or Host. MergeRewriteHeaders strips hop-by-hop headers (RFC 7230 §6.1) and credential/identity headers before applying. Rejected keys are logged.
- Response rewrites also drop Content-Encoding when a fresh body is supplied, so the next cascade step doesn't try to gunzip plaintext.

Unknown actions
- IsKnownAction is checked on every decision; unknown actions still fall through to allow/passthrough (safest default for forward compatibility) but now emit a warn log naming the middleware and action. One typo shouldn't silently bypass policy without a trace.

Response hook had request_body always empty
- The plain-HTTP response cascade read RequestBodyFromContext(ctx), but no code ever called WithRequestBody, so ResponseMsg.RequestBody was always nil. ProxyRequestHook now returns (ctx, decision) and the request cascade stashes the captured body on ctx so the response cascade can include it.

Filter cache leak
- filterCache was a global `map[*HookFilter]*compiledFilter` keyed by pointer identity. On reconnect the hello response produced a fresh HookFilter pointer, so the cache grew indefinitely. Compiled regexes now live on the HookFilter itself behind a sync.Once and are GC'd with the filter.

Refactor
- The four near-duplicate cascades in program.go (~400 lines) are now one runRequestCascade + one runResponseCascade. The transport-specific hook entry points (plain HTTP / MITM, request / response) translate to the neutral cascade result type, so a fix in the iteration logic applies to all four paths at once.
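To make the header denylist concrete, here is a hedged Python sketch of the idea behind MergeRewriteHeaders. The hop-by-hop set follows RFC 7230 §6.1 and the protected set contains only the headers this PR names (Authorization, Cookie, Set-Cookie, Host); the exact list in the Go implementation may be longer, and the function name and return shape are illustrative.

```python
# Hop-by-hop headers per RFC 7230 section 6.1.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}
# Credential/identity headers named in this PR; the real denylist may contain more.
PROTECTED = {"authorization", "cookie", "set-cookie", "host"}


def merge_rewrite_headers(current: dict, proposed: dict):
    """Apply a middleware's header rewrite, refusing denylisted keys.

    Returns (merged_headers, rejected_keys) so the caller can log rejections.
    """
    merged = dict(current)
    rejected = []
    for key, value in proposed.items():
        lowered = key.lower()
        if lowered in HOP_BY_HOP or lowered in PROTECTED:
            rejected.append(key)
            continue
        # Replace any existing header with the same name, ignoring case.
        for existing in [k for k in merged if k.lower() == lowered]:
            del merged[existing]
        merged[key] = value
    return merged, rejected
```

Matching on the lowercased name is what keeps a middleware from slipping `AUTHORIZATION` past the filter.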
All six non-rtk examples now send a "name" field in their hello response. Greyproxy uses this for activity badges; without it, the rows show the full ws:// URL instead, which is noisy. rtk already had this.
New tests lock in the guarantees the PR review surfaced:
- Hello type validation: a server replying with the wrong type must not mark the client ready.
- Fallback actions: request-hook timeout returns deny (default) or allow (opt-in); response-hook timeout returns block (default) or passthrough (opt-in); Fallback reason is set so callers can log it.
- Drain on disconnect: an in-flight ResponseMsg Send gets a response-shaped default (block) not a request-shaped one (deny).
- Malformed frame: a garbage WebSocket frame is skipped without dropping the connection; a later valid decision still reaches the waiting Send.
- Header denylist: MergeRewriteHeaders refuses Authorization, Cookie, Host, Set-Cookie, and hop-by-hop headers (case-insensitively) while applying safe headers. This is the security-critical regression guard against a compromised middleware escalating credentials.
- Filter match semantics: host glob with leading *. wildcard, path regex compiled-once caching, LLM gate, and content-type parameter stripping.
- NewID uniqueness and hex shape.
- ActionForTimeoutKind / IsKnownAction / BodyChanged helpers.

main_test.go installs a no-op logger so the cascade fallback paths don't nil-panic under `go test` (the binary installs a real logger via logger.SetDefault; tests have to do it themselves).
The old backoff went 100ms → 10s doubling and never reset across the outer
for loop. Once the cap was reached (after ~7 disconnects), every subsequent
restart-reconnect-restart cycle sat at 10s. Middleware development flows
(auto-reload, container restart) suffered the most.
Three changes:
- Cap lowered from 10s to 2s. An LLM request at default timeout_ms=2s
doesn't benefit from a longer reconnect window; the request has already
fallen back by then.
- Backoff resets to the initial 100ms when the previous connection was
up for at least 5 seconds ("healthy"). A working middleware that
restarts now reconnects within a few hundred ms, not seconds.
- Added ±20% jitter so multiple greyproxy instances (or multiple
middlewares behind the same outage) don't reconnect in lockstep.
Docs: clarify the three distinct timeouts (hello 5s, per-message timeout_ms,
reconnect backoff) — the old doc only mentioned the hello one in passing.
Test added: asserts jitter stays inside the ±20% envelope.
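The three backoff changes can be modelled in Python as follows. The constants come from the commit message (100ms initial, 2s cap, 5s healthy threshold, ±20% jitter); the function names and the split into two helpers are assumptions for the sketch, not the layout of client.go.

```python
import random

INITIAL_BACKOFF = 0.1   # seconds
MAX_BACKOFF = 2.0       # cap lowered from 10s
HEALTHY_UPTIME = 5.0    # a connection up this long resets the backoff
JITTER = 0.2            # plus or minus 20%


def next_backoff(current: float, last_uptime: float) -> float:
    """Pick the base delay before the next reconnect attempt."""
    if last_uptime >= HEALTHY_UPTIME:
        return INITIAL_BACKOFF              # previous connection was healthy: start over
    return min(current * 2, MAX_BACKOFF)    # otherwise keep doubling up to the cap


def with_jitter(delay: float, rng=random.random) -> float:
    """Scale delay by a random factor in [1 - JITTER, 1 + JITTER)."""
    return delay * (1 + JITTER * (2 * rng() - 1))
```

With these numbers, a middleware that ran for a minute and then restarted is retried after roughly 80 to 120 ms rather than at the cap.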
2 seconds is too tight once the middleware is non-trivial. Real policy middlewares regularly offload their decision to another LLM (PII classification, prompt-injection detection, policy evaluation on model output), or to a slow local scanner. 2s was an artefact of treating the middleware as a pure-regex predicate; the protocol supports more than that.

10s gives LLM-offloaded middlewares a realistic budget without feeling unbounded. Operators whose middleware is purely local can drop `timeout_ms` as low as they like in YAML; the docs now flag this as the shape of good config. CLI-only middlewares take the default.

The test TestClient_DefaultTimeoutGenerous pins the default so a future revert has to touch the test too.

Config.TimeoutMs default is now resolved inside middleware.New rather than duplicated in buildMiddlewareConfigs. Single source of truth; YAML values still override when present.
Option B from the review: middlewares declare a [min_version, max_version] range in their hello response, the proxy picks the highest integer in the overlap of that range and [1, ProtocolVersion]. No overlap refuses the connection with a readable error naming both ranges. Omitting both bounds is equivalent to declaring [1,1], so every example middleware already in the repo keeps working without any wire change.

Why bother now while ProtocolVersion is still 1:
- The mechanism has to exist *before* we bump. If we ship v2 without negotiation, every middleware in the wild silently sees the wrong shape until its author reads a changelog. With negotiation, a v2 proxy connecting to a v1-only middleware picks v1 (if v1 is still supported) or refuses the connection with a clear message (if v1 has been retired) rather than hanging on a field the middleware never filled in.
- Agreed version is logged at connect, so operators can see which version each middleware negotiated without guessing from code.

Examples updated to declare min/max explicitly; this acts as documentation and as a pin against a future proxy that retires v1.

Tests cover the full matrix plus the backwards-compat path (omitted bounds) and the refused-connection path (middleware requires v>proxy).
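The negotiation rule reads naturally as a range intersection. The Python sketch below follows the description above; `negotiate` is a hypothetical name, and the handling of a hello that sets only one bound (missing min becomes 1, missing max becomes min) is an assumption, since the commit only specifies the case where both are omitted.

```python
PROTOCOL_VERSION = 1  # highest version the proxy speaks


def negotiate(min_version=None, max_version=None, proxy_max=PROTOCOL_VERSION) -> int:
    """Return the highest version in the overlap of [min, max] and [1, proxy_max]."""
    # Omitting both bounds is equivalent to declaring [1, 1].
    lo = 1 if min_version is None else min_version
    hi = lo if max_version is None else max_version   # assumption for a lone min
    agreed = min(hi, proxy_max)
    if agreed < max(lo, 1):
        raise ValueError(
            f"no common protocol version: middleware supports [{lo}, {hi}], "
            f"proxy supports [1, {proxy_max}]"
        )
    return agreed
```

For example, a middleware declaring [1, 3] against a proxy at version 2 would settle on 2, while one declaring [2, 3] against a version 1 proxy would be refused.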
Settings page now has a "Middlewares" tab listing every configured middleware with a live connection state badge, the URL, the negotiated protocol version, its declared hooks, and the effective timeout_ms / on_disconnect policy.

Read-only: middleware configuration is owned by CLI flags and greyproxy.yml, not the runtime store, so this page does not offer mutation. The UI hits GET /api/middlewares which returns the current snapshot.

Plumbing:
- middleware.Client gets URL() / TimeoutMs() / OnDisconnect() / IsConnected() getters. IsConnected reads c.conn!=nil under the mu lock so the flag tracks reconnects without an event bus.
- greyproxy.MiddlewareStatus is the wire/struct shape for the API.
- api.Shared gains a MiddlewareStatusesFn closure field; the api package stays free of any middleware-package import (no cycle).
- cmd/greyproxy sets the closure after creating the clients. Each call to the handler runs the closure fresh, so a middleware that goes down surfaces immediately without UI state drifting.
- A "Refresh" button reloads on demand; switching to the tab also triggers a load.

Smoke-tested end to end: connected state flips from true to false when the upstream middleware is killed, without restarting greyproxy.
Operators can now point greyproxy at a command instead of a URL:
greyproxy serve --middleware-cmd 'uv run ./mw.py'
Greyproxy spawns the child, owns its lifecycle, and talks NDJSON on
stdin/stdout. Reconnection and fallback decisions are identical to the
WebSocket path — the child crashing triggers the same exp-backoff
respawn that a WS disconnect does. Same wire protocol, same hello
exchange, same version negotiation, same header denylist, same
per-message timeout. The only difference is framing.
Rationale: every existing example middleware ships its own WebSocket
server boilerplate (~30 lines), makes the operator manage a port, and
requires two terminals. For local single-host deployments that's
friction with no upside. The stdio path matches how MCP servers are
typically launched and reduces "start my middleware" to one flag.
Transport layout:
- internal/greyproxy/middleware/transport.go introduces a Transport
interface (WriteMessage / ReadMessage / Close) and two
implementations: wsTransport (gorilla, extracted from the previous
inline code, no logic change) and stdioTransport (exec.CommandContext,
bufio.Scanner on stdout, bounded stderr forwarder into the logger).
- Client is now transport-agnostic: New() picks the dialer based on
whether Config.URL or Config.Command is set, and the rest of the
client (hello, pending map, Send, drain, fallback) doesn't care.
- stdioTransport.Close closes stdin first (so a well-behaved child
exits on EOF), waits stdioCloseGrace (2s), then SIGKILLs. Prevents
zombies when the proxy exits.
- Hello timeout now works on any transport: readMessageWithTimeout
runs ReadMessage in a goroutine and closes the transport to unblock
it on timeout. Previously we relied on WS-specific SetReadDeadline.
Config + CLI:
- Config gains Command []string and Name string. Exactly one of URL or
Command must be set; YAML entries with both are skipped with a
warning.
- splitCommand parses --middleware-cmd with shell-like rules (quotes,
backslash escapes) but never invokes a shell. Operators who need
shell features pass "sh -c '...'" explicitly.
- MiddlewareStatus gains Kind ("ws" | "stdio") so the UI can
distinguish the two in the Middlewares tab.
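For readers unfamiliar with "shell-like rules without a shell", Python's standard `shlex.split` behaves in a comparable way and makes a convenient illustration. This is an analogy only: splitCommand is a Go function in this PR and its edge-case behaviour may differ from shlex. The wrapper name below is hypothetical.

```python
import shlex


def split_command(command: str) -> list:
    """Split a command string into argv using quoting rules, without running a shell."""
    return shlex.split(command)


# Quotes group words; no variable expansion, globbing, or pipes take place.
argv = split_command("uv run ./mw.py")
wrapped = split_command("sh -c 'uv run ./mw.py | tee mw.log'")
```

In the second call the pipe stays inside a single argument, so it only takes effect because `sh -c` is asked for explicitly.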
Tests:
- TestSplitCommand covers quoting, escapes, leading/trailing
whitespace, unterminated quotes, empty input.
- TestStdioTransport_HelloAndDecision re-execs the test binary as a
fake middleware (main_test.go reads GREYPROXY_FAKE_MW and acts
accordingly) so the full spawn→hello→request→decision→close cycle
runs without depending on Python or a separate fixture binary.
- TestStdioTransport_ChildExit_TriggersReadError pins the "middleware
died mid-conversation → ReadMessage returns error" behaviour that
triggers the client's reconnect loop.
- TestStdioTransport_CloseKillsChild asserts the SIGKILL path fires
within the grace window.
Smoke-tested end to end: same Python passthrough middleware used under
both --middleware ws://... and --middleware-cmd 'uv run mw.py', plus
a blocking scenario through the secret-scanner.
examples/_lib/greyproxy_middleware.py is a small shared library (no
external deps beyond the existing websockets for ws mode) that hides
the transport from middleware authors. The author writes two functions:
def handle_request(msg): ...
def handle_response(msg): ...
run(name="my-mw", handle_request=handle_request, handle_response=handle_response)
run() picks the transport at launch time. If GREYPROXY_TRANSPORT=stdio
is in the env (set by greyproxy when it spawns the child), the helper
speaks NDJSON on stdin/stdout. Otherwise it starts a WebSocket server
on $GREYPROXY_WS_PORT (default 9000). The handler code is identical.
Important stdio property: stdout is the protocol. The helper redirects
all logging to stderr on startup and replaces any preinstalled handlers
so a middleware that configured its own logger can't accidentally
corrupt the wire by writing to stdout.
Helper also ships the decision builders (allow, deny, rewrite_request,
passthrough, block, rewrite_response, decode_body) that were
copy-pasted into every previous example. Handlers that raise are
caught and fall back to allow/passthrough so the stream survives one
bad request.
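A stripped-down sketch of what the stdio side of such a helper does: read one JSON object per line, call the handler, write one JSON object per line, and keep all diagnostics on stderr. The `id` and `action` field names and the `serve_ndjson` function are assumptions for illustration; the real wire fields are defined in docs/middleware.md and the shipped helper, and the real helper also handles the hello exchange and responses.

```python
import json
import sys


def serve_ndjson(handle_request, stdin=sys.stdin, stdout=sys.stdout):
    """Minimal NDJSON loop: one JSON message per line in, one decision per line out."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
        except ValueError:
            # Diagnostics go to stderr; stdout carries protocol frames only.
            print("skipping malformed frame", file=sys.stderr)
            continue
        try:
            decision = handle_request(msg)
        except Exception as exc:
            # A raising handler falls back to allow so the stream survives.
            print(f"handler failed: {exc}", file=sys.stderr)
            decision = {"action": "allow"}
        decision["id"] = msg.get("id")   # hypothetical correlation field
        stdout.write(json.dumps(decision) + "\n")
        stdout.flush()                   # the proxy is waiting on this line
```

Flushing after every line matters because the proxy is blocked waiting for each decision.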
Rewrote two examples to use the helper:
- middleware-passthrough-py: the canonical template. Drops from ~150
lines (inline WS server + duplicated decision helpers) to ~45 lines
of actual logic.
- middleware-secret-scanner-py: demonstrates a policy middleware. Also
smaller, and now emits tags on block so the operator sees which
pattern matched in the Activity view.
Other examples (pii-redactor, command-stripper, cost-tracker, audit-log,
rtk-compress) are intentionally left in their inline form for now —
they stand as proof that the older pattern keeps working. They can
migrate later; no urgency because the wire protocol is identical.
The Middlewares tab in Settings now shows a purple "stdio" or blue "ws" badge next to each middleware's name, so an operator can tell at a glance whether a given entry is a child process owned by greyproxy or an external WebSocket service. The URL column already showed "stdio:<cmd>" for spawned children; the badge is the quick visual cue. Small layout tweak alongside: name + kind pill are now in a flex group on the left so they stay together when the name wraps.
…tdio

Three issues surfaced when trying `--middleware-cmd 'uv run examples/middleware-rtk-compress-py/middleware.py'`:

1. Zombie children across respawns. A command like `uv run mw.py` is a wrapper: uv forks the real Python interpreter as a grandchild. stdioTransport.Close was only signalling t.cmd.Process (uv), so when uv died, Python got reparented to init and kept holding whatever ports it had bound. Next respawn failed with "address already in use" and the cycle repeated forever.
   Fix: set Setpgid=true on the exec.Cmd so the child starts its own process group; kill -pgid on SIGKILL so the whole subtree dies together. Split into transport_unix.go / transport_windows.go because SysProcAttr.Setpgid is unix-only. Verified end-to-end: greyproxy→uv→python tree vanishes on SIGTERM to greyproxy, port that was held by the grandchild is released immediately.
2. rtk example still embedded its own WebSocket server. Only passthrough + secret-scanner were ported to the helper library in the previous commit; rtk still had inline asyncio+websockets code. When spawned via --middleware-cmd, it opened port 9000 instead of speaking NDJSON on stdout, greyproxy's hello-read timed out, and the respawn loop hit issue (1).
   Fix: rewrite rtk to import `run`/`allow`/`rewrite_request`/`decode_body` from examples/_lib and call run() at module scope. Core logic (pick_mode, rtk_compress, Anthropic/OpenAI walkers) is unchanged. Also removed three leftover `print()` debug statements that would corrupt the stdout frame stream in stdio mode.
3. Cosmetic: "middleware connected" log was showing `url=` empty for stdio entries because it read clientURLs[i] (the config URL, which is empty for command: entries) instead of c.URL() (the endpoint string the client actually uses). And the stderr prefix showed `mw[?]` because the CLI --middleware-cmd flag doesn't supply a Name. The dialer now falls back to filepath.Base(command[0]) (`mw[uv]`) until the real name arrives in the hello, and the connected-log uses c.URL() / c.Kind().
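The process-group fix in item 1 has a close POSIX analogue in Python, sketched below. It uses `start_new_session=True`, which calls setsid rather than setpgid, so it is similar in effect (the child leads its own group) but not identical to Go's Setpgid; the function names and the stdin-then-grace-then-SIGKILL order mirror the description of stdioTransport.Close and are otherwise assumptions. Unix only.

```python
import os
import signal
import subprocess
import sys


def spawn(argv):
    """Start the child as leader of its own session and process group."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        start_new_session=True,
    )


def close(proc, grace=2.0):
    """Close stdin, wait for a clean exit, then SIGKILL the whole group."""
    proc.stdin.close()                        # a well-behaved child exits on EOF
    try:
        code = proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # The child's pid equals its process-group id, so wrappers and
        # grandchildren (e.g. uv -> python) are killed together.
        os.killpg(proc.pid, signal.SIGKILL)
        code = proc.wait()
    proc.stdout.close()
    return code
```

Signalling only `proc.pid` with `proc.kill()` would reproduce the original bug whenever the command is a wrapper.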
audit-log, command-stripper, cost-tracker, pii-redactor now use examples/_lib/greyproxy_middleware.py the same way passthrough, secret-scanner, and rtk-compress already do. The author writes handle_request / handle_response and calls run(); the library picks stdio or WebSocket transport based on how greyproxy launched it. Every example in the repo now supports both transports without code changes.

Three additional wins from the conversion:
- Inline asyncio + websockets boilerplate (roughly 40 lines per file) is gone. Each example is now the decision logic and almost nothing else.
- cost-tracker, command-stripper, and pii-redactor emit tags on every mutating decision (cost.model / cost.usd / command-stripper.flags / pii.redacted / pii.restored). These show up in the Activity UI badges and in stashed event metadata, so operators can see per-request what a middleware did without reading the middleware's own logs.
- All stray print() calls are gone. In stdio mode stdout is the protocol, so any print() corrupts frames; the helper forces logging to stderr but handler code still has to avoid print().

Smoke-tested all four under --middleware-cmd: each negotiates v1, declares the expected hooks, and surfaces in /api/middlewares with kind=stdio, connected=true.
…dleware

The page still read as if WebSocket was the only option: the opening paragraph said "connects over a persistent WebSocket", the Overview diagram labelled the edge "JSON/WS", Quick Start started with `uv run middleware.py` in one terminal and `--middleware ws://...` in another, and the "Writing a middleware" section still described the middleware as a WebSocket server. Since stdio is now the preferred launch path for local middlewares, the whole framing needs to land the transport choice early and present stdio first in every example.

Changes:
- Rewrote the opening summary around "two transports, same wire protocol, pick one per middleware". Explicit guidance: stdio for local, WS for shared/remote.
- Neutralised the Overview diagram: message arrows are JSON, the transport annotation sits alongside.
- Quick Start leads with --middleware-cmd one-liner, then WS.
- Examples table updated to seven entries (rtk-compress was missing) and the preamble now explains that the shared helper makes every example dual-transport.
- Configuration section: --middleware-cmd introduced alongside --middleware, with guidance on shell semantics (no sh -c, argv split with shell-like quoting only).
- YAML sample shows a command: entry first, then a url: entry, each with the config fields a real operator actually cares about (name, on_disconnect, auth_header).
- Connection lifecycle: mentions both transports; adds the stdio-specific note about process-group ownership so operators understand why the child tree exits cleanly when greyproxy does.
- Writing a middleware: removed the "WebSocket server" framing; now leads with the Python helper (run(handle_request=...)) and has a separate "Other languages" subsection listing the wire requirements for stdio and WS.
- Multiple s/WebSocket/transport/ in places where the text said "WebSocket" but meant either framing.
- Fixed the "6 examples" count → 7.

No protocol change; this is a docs-only reframing.
The info block at the top of the Middlewares tab said middlewares "connect over a persistent WebSocket" and only pointed operators at --middleware, which is now wrong, since stdio is the preferred launch path for local deployments.

Rewritten to:

- lead with "two transports, same JSON protocol"
- describe each transport with its CLI flag and YAML shape in a short bulleted list
- link out to docs/middleware.md for the full protocol

Also updated the "no middlewares configured" empty-state hint so a first-time operator sees both options.

Rendered HTML verified: stdio mentioned, --middleware-cmd mentioned, stale "over a persistent WebSocket" string gone.
Previous version had a bulleted list explaining each transport and a link to docs/middleware.md. That's mini-documentation, and the docs aren't hosted online yet anyway. Reduced to a single sentence naming the three ways to configure (--middleware-cmd, --middleware, yaml) and a reminder that the list is read-only at runtime. Detailed guidance belongs in docs/middleware.md once it has a URL.
0bb8fc2 to
505606e
…cket

# Conflicts:
#	internal/gostx/internal/util/sniffing/sniffer.go
Discard errors via `_ =` on defer Close() and `go func() { _ = c.Start() }()`
so golangci-lint passes; switch one os.Setenv in a *testing.T helper to
t.Setenv. No behaviour change.
Closes #36
Summary
External middleware services that can inspect, block, or rewrite HTTP requests and responses in real time, over a JSON protocol. Greyproxy handles all networking, TLS termination, and MITM cert generation; the middleware just sees structured JSON and returns decisions.
Two transports, same wire protocol:
Both are configured per-middleware via `--middleware-cmd '…'` (stdio) and `--middleware ws://…` (WS), repeatable and freely mixed; YAML equivalents under `greyproxy.middlewares`. Multiple middlewares cascade in declaration order; each sees the previous one's (possibly rewritten) output; `deny`/`block` short-circuits the chain.
What's in the protocol
- `hello` exchange with version negotiation (`min_version`/`max_version` overlap), declared hooks, declared filters, optional friendly `name`, and a `max_body_bytes` opt-out for large bodies.
- `http-request` (pre-upstream) and `http-response` (post-upstream, with the original request inlined for context).
- `host` (glob), `path` (regex), `method`, `content_type` (glob), `container` (glob), `tls`, and `llm`. The last one piggybacks on greyproxy's built-in LLM dissector mapping (Anthropic/OpenAI/Google/OpenRouter + user-defined providers), so adding a provider in the UI takes effect on the next request with no middleware restart.
- `allow`, `deny`, `passthrough`, `block`, `rewrite`. Rewrite headers go through a hop-by-hop + credential denylist (`Authorization`, `Cookie`, `Set-Cookie`, `Host`, `Connection`, `Transfer-Encoding`, …) so a buggy or compromised middleware cannot silently escalate auth or reroute requests.
- `on_disconnect: allow` per middleware (recommended for observation-only middlewares).
Hook points
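As an aside on the `hello` exchange described under "What's in the protocol": the `min_version`/`max_version` overlap check can be sketched as a range intersection. This is a minimal sketch, not greyproxy's actual code; the function name is hypothetical, and both the integer versions and the "pick the highest shared version" rule are assumptions.

```python
from typing import Optional


def negotiate_version(proxy_min: int, proxy_max: int,
                      mw_min: int, mw_max: int) -> Optional[int]:
    """Return the highest version both ranges share, or None if they do not overlap."""
    low = max(proxy_min, mw_min)    # lowest version acceptable to both sides
    high = min(proxy_max, mw_max)   # highest version acceptable to both sides
    return high if low <= high else None


# Proxy speaks only v1, middleware speaks v1 through v2: they settle on v1.
agreed = negotiate_version(1, 1, 1, 2)
# Proxy speaks only v1, middleware requires at least v2: no overlap.
rejected = negotiate_version(1, 1, 2, 3)
```

When the ranges do not overlap, the caller would refuse the connection rather than guess a version.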
Four call sites wired through the proxy pipeline:
Response bodies are decompressed before being sent to the middleware, and `Content-Encoding` is stripped on `rewrite` so the client doesn't receive a body that is labelled as compressed but is actually plain.
UI
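Going back to the rewrite behaviour described under "Hook points": the header clean-up can be sketched as follows. An earlier commit message in this PR states that `Content-Encoding` and `Transfer-Encoding` are removed on rewrite; the function name is hypothetical, and recomputing `Content-Length` is an assumption added here so the example is self-consistent, not something the PR text confirms.

```python
from typing import Dict

# Headers that describe the original wire encoding and no longer match a rewritten body.
_STALE_ON_REWRITE = {"content-encoding", "transfer-encoding", "content-length"}


def headers_after_rewrite(headers: Dict[str, str], new_body: bytes) -> Dict[str, str]:
    """Return a copy of the headers that is valid for an uncompressed, rewritten body."""
    cleaned = {
        name: value
        for name, value in headers.items()
        if name.lower() not in _STALE_ON_REWRITE  # header names are case-insensitive
    }
    # Assumption: the length is recomputed from the plaintext body.
    cleaned["Content-Length"] = str(len(new_body))
    return cleaned


original = {"Content-Type": "text/plain", "content-encoding": "zstd", "Content-Length": "999"}
rewritten = headers_after_rewrite(original, b"hello")
```

Without this step the client would try to decompress plaintext using the codec named in the stale header and fail.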
- ws/stdio, name, hooks, filters, last connect status).
- `/api/middlewares` endpoint backing it.
Reliability
uv run.
Examples
Seven Python examples under `examples/`, each a single file runnable with `uv run middleware.py`. All use a small shared helper (`examples/_lib/greyproxy_middleware.py`) that auto-detects the transport from `GREYPROXY_TRANSPORT`, so the same source runs unchanged under either stdio or WS.
- `middleware-passthrough-py`
- `middleware-command-stripper-py`
- `middleware-pii-redactor-py`
- `middleware-secret-scanner-py`
- `middleware-cost-tracker-py`
- `middleware-audit-log-py`
- `middleware-rtk-compress-py`: `tool_result` output via rtk to save context-window tokens.
Documentation
`docs/middleware.md` covering both transports, the full wire protocol, version negotiation, filters, decision shapes, body handling, the three timeouts, the rewrite header denylist, and how to write a middleware in any language.
Test plan
- `go build ./...` passes
- `go test ./...` passes (existing tests unaffected)