Expose listening ports and remote-peer scope in netstat by sergeyfast · Pull Request #12 · vmkteam/topsrv

sergeyfast · 2026-05-13T18:27:47Z

Summary

Two commits, each focused on a single concern and independently revertable.

1. `f3154a9` Add netstat listening_ports with scope/process

A new gauge topsrv_netstat_listening_ports{port, family, scope, process} emits one series per active TCP LISTEN socket — the operator-level answer to "what's actually exposed on this host?".

- `classifyAddr` buckets the bind address into loopback (127.0.0.0/8 or ::1), private (RFC1918, RFC4193 ULA, RFC6598 CGNAT, link-local), or public (a routable address or the 0.0.0.0/:: wildcard). Wildcards are treated as public, the worst case, so a missing public interface doesn't hide an exposed listener.
- `resolveProcName` attaches the owning binary name via a per-scrape PID cache. The name is empty when kernel access controls hide the owning process (run the agent as root for full visibility). PID=0 short-circuits without calling psproc.NewProcess; missing PIDs are negatively cached.
- Emission is capped at 256 series per scrape so a compromised host with thousands of listeners cannot blow Prometheus cardinality budgets. A single truncation warning fires when the cap is hit.
- Fixes a small leftover from "Collector hardening: aliases, rollback, PG18, leak" #11 (fb9bf8d): the postgres integration test still indexed an appNames sub-map as bool after the map[string]time.Time switch.
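
As a rough illustration of the scope taxonomy above, the bucketing can be sketched with the standard `net/netip` package. This is not the code from the PR: the function body, the `cgnat` variable, and the `"unknown"` result for unparsable input are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/netip"
)

// cgnat is the RFC6598 shared address space; netip's IsPrivate does not cover it.
var cgnat = netip.MustParsePrefix("100.64.0.0/10")

// classifyAddr maps a bind address to "loopback", "private", or "public".
// Returning "unknown" for unparsable input is an assumption of this sketch.
func classifyAddr(s string) string {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "unknown"
	}
	addr = addr.Unmap() // treat ::ffff:a.b.c.d like plain IPv4
	switch {
	case addr.IsUnspecified():
		return "public" // 0.0.0.0 or :: wildcard: assume the worst case
	case addr.IsLoopback():
		return "loopback" // 127.0.0.0/8 or ::1
	case addr.IsPrivate(), addr.IsLinkLocalUnicast(), cgnat.Contains(addr):
		return "private" // RFC1918, ULA, link-local, CGNAT
	default:
		return "public"
	}
}

func main() {
	addrs := []string{"127.0.0.1", "::1", "10.1.2.3", "100.64.0.1", "fe80::1", "0.0.0.0", "8.8.8.8"}
	for _, a := range addrs {
		fmt.Printf("%-12s %s\n", a, classifyAddr(a))
	}
}
```

Checking the wildcard before anything else keeps the worst-case rule explicit: a listener bound to 0.0.0.0 or :: is reported as public regardless of which interfaces the host currently has.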

2. `51ef29b` Add UDP listeners and TCP remote_scope label

Extends the same security story to UDP and outbound traffic.

listening_ports gains a proto label (tcp / udp). A UDP socket counts as "listening" when it's bound but has no connected peer (empty Raddr) — catches DNS, mDNS, WireGuard, NTP, and unexpected UDP backdoors next to TCP coverage. UDP failures are non-fatal: a log line is emitted and TCP results still ship.
topsrv_netstat_tcp_connections gains a remote_scope label that classifies the peer address with the same loopback/private/public taxonomy. LISTEN sockets carry remote_scope=none. This lets dashboards alert on:
- public inbound (scan exposure)
- private→public outbound (exfil signal)
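
The listener rule and the `remote_scope=none` invariant described above can be sketched as follows. The `sock` type, its fields, `isListener`, `remoteScope`, and `stubClassify` are all hypothetical names for this example; in particular, what counts as an "empty" remote address in the real socket table is an assumption here.

```go
package main

import "fmt"

// sock is a hypothetical, simplified view of one socket table entry.
type sock struct {
	Proto      string // "tcp" or "udp"
	Status     string // TCP state such as "LISTEN"; empty for UDP
	LocalPort  uint32
	RemoteIP   string // empty when there is no connected peer (assumption)
	RemotePort uint32
}

// isListener reports whether the socket counts as a listener:
// TCP in LISTEN state, or UDP that is bound but has no connected peer.
func isListener(s sock) bool {
	switch s.Proto {
	case "tcp":
		return s.Status == "LISTEN"
	case "udp":
		return s.LocalPort != 0 && s.RemoteIP == "" && s.RemotePort == 0
	}
	return false
}

// remoteScope returns "none" for listeners and otherwise classifies the peer.
func remoteScope(s sock, classify func(string) string) string {
	if isListener(s) {
		return "none"
	}
	return classify(s.RemoteIP)
}

// stubClassify stands in for the real address classifier in this demo.
func stubClassify(ip string) string {
	if ip == "127.0.0.1" {
		return "loopback"
	}
	return "public"
}

func main() {
	socks := []sock{
		{Proto: "tcp", Status: "LISTEN", LocalPort: 22},
		{Proto: "tcp", Status: "ESTABLISHED", LocalPort: 22, RemoteIP: "8.8.8.8", RemotePort: 51000},
		{Proto: "udp", LocalPort: 5353},
	}
	for _, s := range socks {
		fmt.Println(s.Proto, s.LocalPort, isListener(s), remoteScope(s, stubClassify))
	}
}
```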

Measured overhead

Bench on Linux (orbstack docker, arm64):

| Op | Cost |
|---|---|
| Full `Collect()` | 251 µs |
| `psnet.Connections("tcp")` alone | 200 µs |
| Net new code | ~50 µs |
| `resolveProcName` cold | 15 µs per PID |
| `resolveProcName` cached | 2.77 ns |

UDP adds one more Connections call (~200 µs); remote_scope adds ~200 ns per non-LISTEN connection. On a host with 333 TCP conns the parse overhead is ~66 µs; on 10K conns ~2 ms — still <0.01% of a 30 s scrape budget.
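
The parse-overhead figures follow directly from the ~200 ns per-connection cost quoted above:

```math
\begin{aligned}
333 \times 200\,\text{ns} &= 66.6\,\mu\text{s} \\
10^{4} \times 200\,\text{ns} &= 2\,\text{ms} \\
\frac{2\,\text{ms}}{30\,\text{s}} &\approx 6.7 \times 10^{-5} \approx 0.0067\%
\end{aligned}
```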

Test plan

make fmt lint test — 0 lint issues, all packages green
make test-integration — full docker-compose stack passes (TestIntegrationPostgres runs against PG17)
Unit tests: TestClassifyAddr (27 cases: loopback × 3, private including RFC1918/CGNAT/ULA/link-local × 13, public × 8, garbage × 2, with just-outside boundaries), TestResolveProcName{PIDZero, MissingPID} (short-circuits and negative caching), TestNetstatCollectorListenPorts (live label/value contract)
Benchmarks: BenchmarkNetstatCollect, BenchmarkNetstatConnectionsOnly, BenchmarkResolveProcName{cold, cached}
Live manual verification on macOS — UDP nc listener on 35353 detected as proto="udp", process="nc", scope="public" alongside real system UDP (rapportd, WireGuard, mDNS); remote_scope splits TCP into four buckets (none, loopback, private, public)
Live manual verification on Linux (orbstack ubuntu under --privileged --pid=host --network=host) — TCP listeners show correct scope, orbstack-agent resolved as process, IPv6 link-local correctly classified as private
Smoke-check on production host after deploy: confirm topsrv_netstat_listening_ports{scope="public"} matches ss -tlnp / ss -ulnp ground truth

Out of scope / follow-ups

Root-bound process inventory as a security audit signal — same process_* collector, gauge of processes running as uid=0.
eBPF-based connection tracking for high-fidelity exfil detection.
maxListenSeries=256 is hardcoded — if a real host genuinely exceeds it (think container hosts running many sidecars) we'll learn from the truncation warning and revisit.
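
For context, a cap with a single truncation warning could look roughly like the sketch below. Only the constant name and its value come from this PR; `capSeries`, the use of string slices, and the log wording are assumptions for illustration.

```go
package main

import (
	"fmt"
	"log"
)

// maxListenSeries is the cap named in the PR; everything else here is a sketch.
const maxListenSeries = 256

// capSeries returns at most maxListenSeries entries and logs exactly one
// warning per call when it has to truncate.
func capSeries(series []string) []string {
	if len(series) <= maxListenSeries {
		return series
	}
	log.Printf("netstat: %d listeners exceed cap of %d, truncating", len(series), maxListenSeries)
	return series[:maxListenSeries]
}

func main() {
	small := make([]string, 10)
	large := make([]string, 1000)
	fmt.Println(len(capSeries(small)), len(capSeries(large)))
}
```

Logging once per scrape rather than once per dropped series keeps the warning itself from becoming a source of noise on a host that is already misbehaving.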

- New gauge topsrv_netstat_listening_ports{port, family, scope, process} emits one series per active TCP LISTEN socket so operators can answer "what is actually exposed?" at a glance
- classifyAddr buckets the bind address into loopback, private (RFC1918, RFC4193 ULA, RFC6598 CGNAT, link-local), or public (routable address or 0.0.0.0/:: wildcard, the worst case)
- resolveProcName attaches the owning binary name via a per-scrape PID cache; empty under kernel ACL (run agent as root for full visibility), with negative caching for missing PIDs and PID=0
- Cap emission at 256 series per scrape so a compromised host with thousands of listeners cannot blow Prometheus cardinality budgets; log a single truncation warning when hit
- Cover scope buckets, RFC1918/CGNAT just-outside boundaries, PID=0 short-circuit, and missing-PID negative caching with unit tests; add three benchmarks (full collect / Connections-only / proc-name cold+cached) measuring 251us total on Linux of which Connections=200us and the new code adds ~50us
- Document the metric, scope semantics, and PromQL recipes (drift alert, public-listener inventory) in docs/metrics.md; mention listening_ports in the README Features row
- Fix postgres integration test that still indexed an appNames sub-map as a bool after the recent map[string]time.Time switch
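
A per-scrape PID cache with negative caching, as described in the commit message above, might be structured like this minimal sketch. The `procCache` type, the injected `lookup` function (a stand-in for the real process lookup), and `fakeLookup` are hypothetical and exist only for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// procCache is a per-scrape cache of PID -> process name.
// Failed lookups are stored as an empty name (negative caching).
type procCache struct {
	names  map[int32]string
	lookup func(pid int32) (string, error) // hypothetical stand-in for the real lookup
	calls  int                             // number of real lookups, for demonstration
}

func newProcCache(lookup func(int32) (string, error)) *procCache {
	return &procCache{names: make(map[int32]string), lookup: lookup}
}

// resolve returns the process name for pid, or "" when it cannot be determined.
func (c *procCache) resolve(pid int32) string {
	if pid == 0 {
		return "" // no owner visible: skip the lookup entirely
	}
	if name, ok := c.names[pid]; ok {
		return name // hit, including cached failures
	}
	c.calls++
	name, err := c.lookup(pid)
	if err != nil {
		name = "" // remember the miss so it is not retried during this scrape
	}
	c.names[pid] = name
	return name
}

// fakeLookup pretends that only PID 42 exists.
func fakeLookup(pid int32) (string, error) {
	if pid == 42 {
		return "sshd", nil
	}
	return "", errors.New("process not found")
}

func main() {
	c := newProcCache(fakeLookup)
	for _, pid := range []int32{42, 42, 999, 999, 0} {
		fmt.Printf("pid=%d name=%q\n", pid, c.resolve(pid))
	}
	fmt.Println("real lookups:", c.calls)
}
```

Creating a fresh cache for each scrape bounds its size to the PIDs seen in one pass and avoids serving a stale name after a PID is reused.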

- listening_ports now carries a proto label (tcp|udp) and emits one series per UDP listener too. A UDP socket counts as "listening" when it is bound but has no connected peer (empty Raddr); this catches DNS, mDNS, WireGuard, NTP, and unexpected UDP backdoors next to the existing TCP coverage
- UDP failures are non-fatal: a log line is emitted and TCP results still ship, so a kernel quirk on one protocol can't disable both
- topsrv_netstat_tcp_connections gains a remote_scope label that classifies the peer address with the same loopback/private/public taxonomy. LISTEN sockets carry remote_scope=none. Dashboards can now alert on public-inbound (scan exposure) or private->public outbound (exfil signal) without per-IP series cardinality
- Update help-strings, README Features row, and PromQL recipe block in docs/metrics.md with UDP and remote_scope examples
- Cover proto label and remote_scope=none-on-LISTEN invariants in the existing live-host tests; classification reuses classifyAddr so the 27-case unit table already validates the bucket logic

sergeyfast added 2 commits May 13, 2026 21:13

sergeyfast merged commit 6392a36 into master May 13, 2026
2 checks passed

sergeyfast merged 2 commits into master from feat/netstat-listening-ports
