Protocol coverage gaps: UDP, FTP, QUIC, HTTP/3, WebRTC, raw IP #313

@schmitthub

Summary

Several common protocols cannot egress at all (silently broken) or egress with no L7 inspection (opaque TCP) under the current firewall design. None of these are regressions; they are the surface clawker today simply doesn't cover. Filed as a single tracker so the gaps are visible and we can decide which to close vs. acknowledge as out-of-scope.

Surveyed during adversarial sweep on feat/networking-dashboard.

Where today's enforcement lives

  • L4 routing decisions: internal/controlplane/firewall/ebpf/bpf/clawker.c + bpf/common.h
    • decide_connect (TCP + connected UDP) — DNS redirect, intra-subnet passthrough, gateway lockdown, per-domain route_map lookup, catch-all → Envoy egress port
    • decide_sendmsg (unconnected UDP) — DNS redirect only, everything else V_DENY
    • connect6 — IPv4-mapped goes through decide_connect; native IPv6 denied outright
    • sock_create — denies SOCK_RAW
  • L7 routing: Envoy filter chains per EgressRule.proto:
    • https → TLS-MITM HCM (full L7 + path rules)
    • http → plaintext HCM (full L7 + path rules)
    • ssh / tcp / other → opaque TCP listener (no L7 inspection, no path rules, no action route metadata)
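The L7 split above can be summarized as a small lookup. This is only an illustrative model of the mapping described in this issue, not the code in envoy_config.go; the chain labels and the `listener_for` helper are hypothetical names.

```python
# Illustrative model of the proto -> Envoy chain mapping described above.
# The label strings are made up for this sketch; they are not Envoy identifiers.
L7_CHAINS = {
    "https": "tls-mitm-hcm",   # full L7 + path rules
    "http": "plaintext-hcm",   # full L7 + path rules
}


def listener_for(proto: str) -> dict:
    """Return which listener shape a rule with this proto would get."""
    chain = L7_CHAINS.get(proto)
    if chain is None:
        # ssh / tcp / anything unrecognized falls through to the opaque listener.
        return {"listener": "opaque-tcp", "l7_inspection": False, "path_rules": False}
    return {"listener": chain, "l7_inspection": True, "path_rules": True}
```

Anything that is not `http` or `https` ends up with no inspection surface, which is what gaps (5), (6), and (7) below have in common.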

Gap inventory

1. Unconnected UDP (non-DNS) → silent deny

decide_sendmsg (common.h:818-839) denies every UDP datagram except dst_port==53. There is no proto: udp rule type and no Envoy UDP listener. Practical impact:

  • NTP (udp/123), STUN/TURN over UDP, QUIC, DTLS, mDNS, SSDP, syslog, custom UDP telemetry — all blocked, no way to allow-list.
  • Surfaces to user as silent application failure (sendmsg returns success-shaped no-delivery on some kernels; EPERM on others).
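To make the blast radius concrete, here is a minimal Python model of the unconnected-UDP decision as described above (DNS redirect, everything else denied). It is a sketch of the behavior, not a port of the BPF code; `Verdict`, `decide_sendmsg_model`, and the service table are names chosen for this example.

```python
from enum import Enum


class Verdict(Enum):
    REDIRECT_DNS = "redirect-dns"
    DENY = "deny"


def decide_sendmsg_model(dst_port: int) -> Verdict:
    """Model of the described rule: only dst_port 53 survives."""
    return Verdict.REDIRECT_DNS if dst_port == 53 else Verdict.DENY


# Well-known UDP ports for the services listed above.
UDP_SERVICES = {
    "dns": 53,
    "ntp": 123,
    "quic": 443,
    "syslog": 514,
    "ssdp": 1900,
    "stun": 3478,
    "mdns": 5353,
}

blocked = sorted(
    name for name, port in UDP_SERVICES.items()
    if decide_sendmsg_model(port) is Verdict.DENY
)
print(blocked)
```

Every entry except `dns` lands in `blocked`, and there is no rule type that could move one of them out.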

2. QUIC / HTTP/3

QUIC rides on UDP, so per (1) it is blocked. Modern clients (Chrome, gRPC, quic-go) negotiate via Alt-Svc and may degrade to h2-over-TCP; some pin QUIC and just fail. We have:

  • No UDP listener in Envoy
  • No QUIC support in Envoy build (would need --define quic=enabled + UDP downstream listener)
  • No SNI inspection equivalent for QUIC's CRYPTO frames (Initial packets are protected with version-specific keys, and Encrypted ClientHello makes this hard regardless)

Even if we add UDP egress, HTTP/3 path-rule enforcement is a separate engineering effort.
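For context on the Alt-Svc negotiation mentioned in this section: the h3 advertisement arrives over the existing TCP connection, and only then does the client try UDP, which is where it hits the deny from (1). A rough sketch of pulling h3 entries out of an Alt-Svc header value follows. The helper name is hypothetical and the parser is deliberately simplified (it splits on commas and does not handle commas inside quoted strings).

```python
from typing import List


def advertised_h3(alt_svc: str) -> List[str]:
    """Return the authorities advertised for h3 (or h3-NN drafts) in an Alt-Svc value."""
    authorities = []
    for entry in alt_svc.split(","):
        # Each entry looks like: h3=":443"; ma=86400
        proto_part = entry.strip().split(";")[0]
        if "=" not in proto_part:
            continue  # e.g. the special value "clear"
        proto, authority = proto_part.split("=", 1)
        if proto.strip().startswith("h3"):
            authorities.append(authority.strip().strip('"'))
    return authorities
```

A response carrying `h3=":443"; ma=86400` tells the client to try QUIC on udp/443; whether it then falls back to TCP or fails is up to the client.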

3. Connected UDP intra-subnet passthrough

decide_connect passes through dest IPs inside the agent's own subnet (is_in_subnet check, common.h:751). This is the load-bearing fast path for agent → Envoy / CoreDNS / host-proxy. Side effect: any UDP connect() + send() to a destination on clawker-net bypasses both decide_sendmsg and Envoy. Today there is no untrusted tenant on clawker-net (CP / Envoy / CoreDNS / OS / Prometheus / OTel collector only — strong auth is the boundary), so this is acceptable. Worth tracking because:

  • if any non-infra container ever joins clawker-net (as the adversarial harness's clawker-test-attacker did), the path becomes exploitable
  • the asymmetry (connected UDP intra-net allowed, unconnected UDP intra-net V_DENY at sendmsg4) is a footgun
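The asymmetry is easier to see side by side. The sketch below models the two paths as described in this issue, assuming the check order listed under decide_connect (DNS redirect first, then intra-subnet passthrough). The subnet value and function names are hypothetical.

```python
import ipaddress

# Hypothetical clawker-net range, chosen only for this example.
CLAWKER_NET = ipaddress.ip_network("172.30.0.0/24")


def connected_udp(dst: str, port: int) -> str:
    """Model of connect() + send(): goes through the decide_connect path."""
    if port == 53:
        return "redirect-dns"
    if ipaddress.ip_address(dst) in CLAWKER_NET:
        return "passthrough"  # bypasses decide_sendmsg and Envoy
    return "route-map-or-catch-all"


def unconnected_udp(dst: str, port: int) -> str:
    """Model of sendto()/sendmsg() without connect(): the decide_sendmsg path."""
    return "redirect-dns" if port == 53 else "deny"


target = ("172.30.0.9", 9999)  # some non-infra listener on clawker-net
print(connected_udp(*target), unconnected_udp(*target))
```

The same datagram to the same destination gets two different verdicts depending only on whether the sender called connect() first.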

4. ICMP

sock_create denies SOCK_RAW (clawker.c:325). Unprivileged ping uses SOCK_DGRAM + IPPROTO_ICMP, which is not caught by the SOCK_RAW deny in sock_create and is not currently routed through decide_sendmsg. Within clawker-net, ICMP works (verified from clawker.clawker.dev to clawker-test-attacker). Outside clawker-net, behavior depends on kernel version and net.ipv4.ping_group_range and was not probed.

There is also no per-rule proto: icmp path, so ICMP can be neither explicitly allowed nor explicitly blocked per destination. Today it is "expected to be blocked outside the subnet (not probed), allowed everywhere inside it, no inspection knob."
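A quick way to see the two ICMP socket flavors from inside a container is to try creating each one. This is a generic probe using Python's standard socket module, not part of the adversarial harness; `probe_socket` is a name invented for this sketch, and the result depends on the kernel, net.ipv4.ping_group_range, and whatever hooks are attached.

```python
import socket


def probe_socket(sock_type: int, proto: int) -> str:
    """Try to create an AF_INET socket and report whether creation was allowed."""
    try:
        s = socket.socket(socket.AF_INET, sock_type, proto)
    except OSError as exc:
        # Covers a deny at socket creation as well as unsupported protocol combinations.
        return f"denied: {exc.strerror or exc}"
    s.close()
    return "allowed"


results = {
    # Unprivileged "ping socket": not a raw socket, so a SOCK_RAW deny does not apply.
    "ping (SOCK_DGRAM + IPPROTO_ICMP)": probe_socket(socket.SOCK_DGRAM, socket.IPPROTO_ICMP),
    # Classic raw ICMP socket: the case sock_create is described as denying.
    "raw (SOCK_RAW + IPPROTO_ICMP)": probe_socket(socket.SOCK_RAW, socket.IPPROTO_ICMP),
}
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

This only tests socket creation; it says nothing about whether an echo request would actually leave the subnet.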

5. FTP (active + passive)

  • Active FTP: the server opens the data channel back to a client-chosen port → inbound, blocked by Docker NAT, unrelated to firewall
  • Passive FTP: data channel on a server-chosen port to a server-chosen IP. Today this would land on proto: tcp (opaque listener) with no path rules and no payload inspection. The control channel (tcp/21) would have to be allow-listed separately and the dynamic data port allow-listed via a port range — EgressRule has no port-range field. So FTP can't actually be allow-listed in practice.
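The dynamic data port comes from the server's reply to PASV, which encodes the IP and port as six decimal numbers (RFC 959). A small sketch of decoding it shows why a static single-port rule cannot cover the data channel; `parse_pasv` is a helper written for this example.

```python
import re
from typing import Tuple

_PASV_RE = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv(reply: str) -> Tuple[str, int]:
    """Decode a '227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)' reply."""
    match = _PASV_RE.search(reply)
    if match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(part) for part in match.groups())
    # The port is sent as two bytes: high byte first.
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


host, port = parse_pasv("227 Entering Passive Mode (192,0,2,10,195,80)")
print(host, port)
```

Both values are picked by the server per transfer, so an allow-list would need either a port range or control-channel awareness.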

6. Opaque-TCP rules have no L7 surface

proto: tcp / proto: ssh / unrecognized protos route to buildTCPListener — a bare tcp_proxy with no L7 filter chain. Implications:

  • No SNI inspection on proto: tcp rules even when the wrapped protocol is TLS-based (e.g. SMTPS, IMAPS, custom TLS-on-non-443)
  • No action route metadata on those access-log records (TCP filter chains hardcode action, but every per-rule listener is allowed by construction — no path-level deny granularity)
  • No path_rules / path_default enforcement
  • LLM training data is weak on Envoy tcp_proxy SNI inspection — we explicitly chose the per-rule listener over a single TLS-inspector chain. Revisit if SNI-routed TCP becomes a need.
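For reference, "SNI inspection" here means reading the server_name extension out of the first TLS record a client sends. The sketch below does that in plain Python, as an illustration of what an SNI-aware chain would look at; it is not how Envoy implements it. It assumes the whole ClientHello fits in one TLS record, and both function names are made up for this example.

```python
import ssl
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from a TLS ClientHello record, or None."""
    # Record header: type(1) version(2) length(2); 0x16 = handshake.
    # Handshake header: msg_type(1) length(3); 0x01 = ClientHello.
    if len(record) < 9 or record[0] != 0x16 or record[5] != 0x01:
        return None
    pos = 9 + 2 + 32  # skip headers, client_version, random
    try:
        pos += 1 + record[pos]                                   # session_id
        pos += 2 + int.from_bytes(record[pos:pos + 2], "big")    # cipher_suites
        pos += 1 + record[pos]                                   # compression_methods
        ext_end = pos + 2 + int.from_bytes(record[pos:pos + 2], "big")
        pos += 2
        while pos + 4 <= min(ext_end, len(record)):
            ext_type = int.from_bytes(record[pos:pos + 2], "big")
            ext_len = int.from_bytes(record[pos + 2:pos + 4], "big")
            body = record[pos + 4:pos + 4 + ext_len]
            if ext_type == 0:  # server_name
                # list_len(2) name_type(1) name_len(2) name; name_type 0 = host_name
                if len(body) < 5 or body[2] != 0:
                    return None
                name_len = int.from_bytes(body[3:5], "big")
                return body[5:5 + name_len].decode("ascii")
            pos += 4 + ext_len
    except IndexError:
        return None
    return None


def sample_client_hello(hostname: str) -> bytes:
    """Produce a real ClientHello in memory (no network) using the ssl module."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the ClientHello is written, the server never answers
    return outgoing.read()


print(extract_sni(sample_client_hello("imap.example.com")))
```

This works for implicit-TLS protocols such as SMTPS and IMAPS, where the ClientHello is the first thing on the wire. It does not help with STARTTLS-style upgrades, where the handshake starts mid-stream.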

7. SMTP / IMAP / POP / SCP-over-arbitrary-port

Same shape as (6). Can be allow-listed as opaque TCP if the destination port is known and static. STARTTLS-style upgrades cannot be inspected. Typical cases: SMTP submission (587), IMAPS (993), POP3S (995). SCP/SFTP on tcp/22 already works via proto: ssh.

8. WebRTC

UDP + DTLS + SCTP-over-DTLS. Blocked at (1). Even with UDP egress, WebRTC ICE candidate gathering needs STUN/TURN over UDP and the firewall would have to allow-list dynamically negotiated peer IPs (out of scope).

9. WebSocket over HTTP/3

Requires QUIC. Same blocker as (2). WebSocket over HTTP/1.1 and HTTP/2 (RFC 8441 Extended CONNECT — currently untested, see .claude/rules/envoy.md) are the only WS surfaces today.

10. Native IPv6

Denied outright in clawker_connect6 (line 212). Documented design. Listed here for completeness.
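The IPv4-mapped vs. native split can be modeled with the standard ipaddress module. This is a sketch of the behavior described above, not the BPF logic itself; `classify_connect6` is a name chosen for this example.

```python
import ipaddress


def classify_connect6(dst: str) -> str:
    """Model of the described connect6 behavior for a destination address."""
    addr = ipaddress.IPv6Address(dst)
    if addr.ipv4_mapped is not None:
        # ::ffff:a.b.c.d is treated as IPv4 and goes through the normal IPv4 decision.
        return f"decide_connect({addr.ipv4_mapped})"
    return "deny"  # native IPv6 is denied outright


print(classify_connect6("::ffff:192.0.2.1"))
print(classify_connect6("2001:db8::1"))
```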

11. Raw sockets

Denied in sock_create (line 325). Documented design. Listed for completeness — closes ICMP-raw, custom-IP, and a class of fingerprinting tooling.

What's NOT a gap

| Behavior | Why not a gap |
| --- | --- |
| ICMP between agent and clawker-net infra | Intra-subnet passthrough is load-bearing; clawker-net is closed infra |
| Plaintext :80 to a proto: https rule | Deny chain RST is correct: port/proto mismatch is a deny |
| HTTP/2 downstream WS upgrade fails | Documented Envoy limitation; clients fall back to h1.1 |
| Long streams without idle timeout | Deliberate revert in 6197d4b3 for LLM streaming workloads |

Proposed slicing

If we tackle these, suggested order:

  1. proto: udp rule type + Envoy UDP listener — closes (1), (2-listener-part), (8-egress-part). Per-destination only; no SNI/path enforcement yet.
  2. Per-rule port ranges — closes (5) and the dynamic-port leg of WebRTC. Schema change on EgressRule + Envoy listener generation.
  3. SNI-inspecting TCP chain for proto: tcp — closes the SNI-only surface of (6) (SMTPS, IMAPS, etc.). Adds path_rules only if we map them to SNI patterns instead of HTTP paths.
  4. HTTP/3 path-rule enforcement — biggest lift. Envoy QUIC build + downstream UDP listener + QUIC HCM + (probably) ECH handling. Defer unless we have a concrete user.

(1) and (2) together cover the majority of practical UDP/FTP/QUIC pain. (3) and (4) are bigger commitments.
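As a starting point for discussing slices (1) and (2), here is one possible shape for a rule carrying a proto and a port range. This is entirely hypothetical: the class name, field names, and matching semantics are assumptions for illustration and do not reflect the current EgressRule schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRuleSketch:
    """Hypothetical rule shape: one destination, one proto, an inclusive port range."""
    dst: str
    proto: str      # e.g. "tcp" or a future "udp"
    port_min: int
    port_max: int

    def __post_init__(self) -> None:
        if not (0 < self.port_min <= self.port_max <= 65535):
            raise ValueError(f"invalid port range {self.port_min}-{self.port_max}")

    def allows(self, dst: str, proto: str, port: int) -> bool:
        return (
            dst == self.dst
            and proto == self.proto
            and self.port_min <= port <= self.port_max
        )


# Passive FTP would need the control port plus the server's passive range.
ftp_rules = [
    EgressRuleSketch("ftp.example.com", "tcp", 21, 21),
    EgressRuleSketch("ftp.example.com", "tcp", 50000, 50100),
]
# NTP becomes expressible once a udp proto exists.
ntp_rule = EgressRuleSketch("time.example.com", "udp", 123, 123)

print(any(r.allows("ftp.example.com", "tcp", 50042) for r in ftp_rules))
```

An open question this sketch ignores is how a range maps onto Envoy listeners, since the current design generates one listener per rule.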

Verification anchors (already in tree)

  • internal/controlplane/firewall/ebpf/bpf/clawker.c + bpf/common.h — L4 decision matrix
  • internal/controlplane/firewall/envoy_config.go — Envoy filter-chain shapes by proto:
  • .claude/rules/envoy.md — current Envoy architecture, including the "Extended CONNECT — Untested" section, which already calls out the HTTP/2 + WS limitation
  • test/adversarial/CLAUDE.md — adversarial harness with UDP/ICMP listeners ready to validate any fix

Labels: enhancement (New feature or request), roadmap (Planned features), security (Security hardening or fixes)
