Expose listening ports and remote-peer scope in netstat #12
Merged
- New gauge topsrv_netstat_listening_ports{port, family, scope,
process} emits one series per active TCP LISTEN socket so
operators can answer "what is actually exposed?" at a glance
- classifyAddr buckets the bind address into loopback, private
(RFC1918, RFC4193 ULA, RFC6598 CGNAT, link-local), or public
(routable address or 0.0.0.0/:: wildcard — worst case)
- resolveProcName attaches the owning binary name via a per-scrape
PID cache; empty under kernel ACL (run agent as root for full
visibility), with negative caching for missing PIDs and PID=0
- Cap emission at 256 series per scrape so a compromised host
with thousands of listeners cannot blow Prometheus cardinality
budgets; log a single truncation warning when hit
- Cover scope buckets, RFC1918/CGNAT just-outside boundaries,
PID=0 short-circuit, and missing-PID negative caching with unit
tests; add three benchmarks (full collect / Connections-only /
proc-name cold+cached) measuring 251us total on Linux of which
Connections=200us and the new code adds ~50us
- Document the metric, scope semantics, and PromQL recipes (drift
alert, public-listener inventory) in docs/metrics.md; mention
listening_ports in the README Features row
- Fix postgres integration test that still indexed an appNames
sub-map as a bool after the recent map[string]time.Time switch
- listening_ports now carries a proto label (tcp|udp) and emits one series per UDP listener too. A UDP socket counts as "listening" when it is bound but has no connected peer (empty Raddr); this catches DNS, mDNS, WireGuard, NTP, and unexpected UDP backdoors next to the existing TCP coverage
- UDP failures are non-fatal: a log line is emitted and TCP results still ship, so a kernel quirk on one protocol can't disable both
- topsrv_netstat_tcp_connections gains a remote_scope label that classifies the peer address with the same loopback/private/public taxonomy. LISTEN sockets carry remote_scope=none. Dashboards can now alert on public-inbound (scan exposure) or private->public outbound (exfil signal) without per-IP series cardinality
- Update help-strings, README Features row, and PromQL recipe block in docs/metrics.md with UDP and remote_scope examples
- Cover proto label and remote_scope=none-on-LISTEN invariants in the existing live-host tests; classification reuses classifyAddr so the 27-case unit table already validates the bucket logic
Summary
Two commits, each focused on a single concern and independently revertable.
1. f3154a9 Add netstat listening_ports with scope/process

   A new gauge topsrv_netstat_listening_ports{port, family, scope, process} emits one series per active TCP LISTEN socket: the operator-level answer to "what's actually exposed on this host?".

   - classifyAddr buckets the bind address into loopback (127.0.0.0/8 or ::1), private (RFC1918, RFC4193 ULA, RFC6598 CGNAT, link-local), or public (routable address or 0.0.0.0/:: wildcard, the worst case). Wildcards default to public so a missing public interface doesn't hide an exposed listener.
   - resolveProcName attaches the owning binary name via a per-scrape PID cache; empty under kernel ACL (run agent as root for full visibility). PID=0 short-circuits without psproc.NewProcess; missing PIDs are negatively cached.
   - (fb9bf8d): postgres integration test still indexed an appNames sub-map as bool after the map[string]time.Time switch.

2. 51ef29b Add UDP listeners and TCP remote_scope label

   Extends the same security story to UDP and outbound traffic.

   - listening_ports gains a proto label (tcp/udp). A UDP socket counts as "listening" when it's bound but has no connected peer (empty Raddr); this catches DNS, mDNS, WireGuard, NTP, and unexpected UDP backdoors next to TCP coverage. UDP failures are non-fatal: a log line is emitted and TCP results still ship.
   - topsrv_netstat_tcp_connections gains a remote_scope label that classifies the peer address with the same loopback/private/public taxonomy. LISTEN sockets carry remote_scope=none. Lets dashboards alert on:
     - public inbound (scan exposure)
     - private->public outbound (exfil signal)

Measured overhead
Bench on Linux (orbstack docker, arm64):
- Collect()
- psnet.Connections("tcp") alone
- resolveProcName cold
- resolveProcName cached

UDP adds one more Connections call (~200 µs); remote_scope adds ~200 ns per non-LISTEN connection. On a host with 333 TCP conns the parse overhead is ~66 µs; on 10K conns ~2 ms, still <0.01% of a 30 s scrape budget.

Test plan
- make fmt lint test: 0 lint issues, all packages green
- make test-integration: full docker-compose stack passes (TestIntegrationPostgres runs against PG17)
- TestClassifyAddr (27 cases: loopback × 3, private including RFC1918/CGNAT/ULA/link-local × 13, public × 8, garbage × 2, with just-outside boundaries), TestResolveProcName{PIDZero, MissingPID} (short-circuits and negative caching), TestNetstatCollectorListenPorts (live label/value contract)
- BenchmarkNetstatCollect, BenchmarkNetstatConnectionsOnly, BenchmarkResolveProcName{cold, cached}
- proto="udp", process="nc", scope="public" alongside real system UDP (rapportd, WireGuard, mDNS); remote_scope splits TCP into four buckets (none, loopback, private, public)
- (--privileged --pid=host --network=host): TCP listeners show correct scope, orbstack-agent resolved as process, IPv6 link-local correctly classified as private
- topsrv_netstat_listening_ports{scope="public"} matches ss -tlnp / ss -ulnp ground truth

Out of scope / follow-ups

- process_* collector, gauge of processes running as uid=0.
- maxListenSeries=256 is hardcoded; if a real host genuinely exceeds it (think container hosts running many sidecars) we'll learn from the truncation warning and revisit.