
Release v0.1.2: nginx URI cap + smaller backup retention #16

Merged
sergeyfast merged 2 commits into master from release/v0.1.2
May 19, 2026

@sergeyfast
Contributor

v0.1.2

Fixes

  • Hard-cap nginx URI labels at 240 bytes (UTF-8-safe trim) — stops
    label_value_too_long from the downstream Prometheus gateway.
  • Detect nginx-escaped binary (\xHH printable text from TLS handshakes
    on port 80) and route to /:invalid instead of inflating per-URI
    metric cardinality.
  • Collapse 32+ char base64url-like segments with uppercase to /:rest
    (session ids, magic links). All-lowercase slugs stay unchanged.
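
The 240-byte cap with a UTF-8-safe trim can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `truncateAtRune` and `restMarker` are named in the description, but the signature, the `/:rest` value, and the `capURI` and `maxURIBytes` names are assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// Assumed values mirroring the PR description; the real constants may differ.
const (
	maxURIBytes = 240
	restMarker  = "/:rest"
)

// truncateAtRune returns the longest prefix of s that is at most n bytes
// long and does not end in the middle of a UTF-8 encoded rune.
func truncateAtRune(s string, n int) string {
	if n <= 0 {
		return ""
	}
	if len(s) <= n {
		return s
	}
	// s[n] is the first byte that would be dropped. If it is a
	// continuation byte, the cut falls inside a rune, so walk back.
	for n > 0 && !utf8.RuneStart(s[n]) {
		n--
	}
	return s[:n]
}

// capURI (hypothetical helper) bounds s to maxURIBytes and appends
// restMarker whenever something had to be cut.
func capURI(s string) string {
	if len(s) <= maxURIBytes {
		return s
	}
	return truncateAtRune(s, maxURIBytes-len(restMarker)) + restMarker
}

func main() {
	long := "/" + strings.Repeat("я", 300) // 601 bytes of Cyrillic
	out := capURI(long)
	fmt.Println(len(out), utf8.ValidString(out)) // prints: 239 true
}
```

The result is 239 bytes rather than 240 because the cut at byte 234 lands inside a two-byte rune and is moved back by one byte.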

Operability

  • Auto-update keeps 2 previous binaries on disk instead of 5.

Test plan

  • make fmt lint — 0 issues
  • go test ./... — all packages green
  • BenchmarkNormalizePath — no regression on all-lowercase paths; TLS handshake −99.9%
  • TestNormalizePath_ByteCap — UTF-8 validity preserved on 300-rune Cyrillic input

Production gateway logs were flooded with label_value_too_long for
topsrv_nginx_*_total{uri=...} because normalizePath collapsed by depth
and regex but never bounded the resulting byte length, and the
control-byte check missed nginx's own escape format (TLS handshakes
on port 80 land in $uri as printable `\xHH` text, not raw bytes).
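
To illustrate why the raw control-byte check never fired, here is a small sketch comparing it with a check for nginx's printable escape. The function names and the `/:invalid` value are assumptions, and this version additionally requires two hex digits after `\x`, which is stricter than the literal `\x` match the description mentions.

```go
package main

import "fmt"

// invalidMarker is named in the PR; its value here is assumed.
const invalidMarker = "/:invalid"

// isHex reports whether c is an ASCII hexadecimal digit.
func isHex(c byte) bool {
	return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f')
}

// hasRawControl is the older style of check: it looks for raw bytes
// below 0x20, which nginx has already escaped before logging.
func hasRawControl(path string) bool {
	for i := 0; i < len(path); i++ {
		if path[i] < 0x20 {
			return true
		}
	}
	return false
}

// hasNginxEscape reports whether path contains the four printable
// characters backslash, 'x', and two hex digits.
func hasNginxEscape(path string) bool {
	for i := 0; i+3 < len(path); i++ {
		if path[i] == '\\' && path[i+1] == 'x' && isHex(path[i+2]) && isHex(path[i+3]) {
			return true
		}
	}
	return false
}

// classify (hypothetical) maps escaped or raw binary input to invalidMarker.
func classify(path string) string {
	if hasRawControl(path) || hasNginxEscape(path) {
		return invalidMarker
	}
	return path
}

func main() {
	// A TLS ClientHello prefix as it appears in $uri: printable text, not raw bytes.
	probe := `/\x16\x03\x01\x02\x00\x01`
	fmt.Println(hasRawControl(probe), hasNginxEscape(probe), classify(probe))
}
```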

- normalizePath now hard-caps the normalized URI at 240 bytes with a
  restMarker suffix, leaving 16 bytes of headroom below the 256-byte
  label limit enforced by the Prometheus gateway. Truncation walks back
  to a UTF-8 rune boundary via a new truncateAtRune helper so a
  Cyrillic / CJK tail is never cut mid-rune and shipped as invalid UTF-8
- Detect the literal `\x` escape sequence in $uri and map to /:invalid.
  nginx writes non-printable bytes as printable `\xHH` text before
  logging, which made the existing path[i] < 0x20 check a no-op against
  TLS ClientHello probes — those were the bulk of the >256-byte URIs
  hitting the gateway
- Collapse 32+ char base64url-charset segments containing at least one
  uppercase letter to /:rest. Random tokens (session ids, email magic
  links) are mixed-case; hyphenated product/article slugs are not, so
  the uppercase requirement keeps legitimate long-transliteration URLs
  readable while hiding unbounded cardinality
- Pre-cap incoming $uri at 1KB. nginx already enforces
  large_client_header_buffers (default 8KB) but the regex pipeline
  shouldn't trust upstream and shouldn't amplify pathological input
- Single byte-scan now covers control bytes, the `\x` escape, and an
  uppercase tracker. The hasUpper flag piggy-backs on the same loop and
  lets the scanner-suffix check skip strings.ToLower when no uppercase
  byte was seen, eliminating one allocation per line on the dominant
  all-lowercase path
- Extract restMarker and invalidMarker sentinel consts so call sites
  and the s[:N-len(restMarker)] truncation arithmetic stay in sync,
  and update truncatePath to use the same const
- Cover the new branches in TestNormalizePath (`\x` escape, long
  mixed-case token, lowercase-slug exemption, oversize raw input) plus
  dedicated TestNormalizePath_ByteCap that verifies UTF-8 validity
  after the 240-byte cap on a 300-rune Cyrillic input, and a
  TestTruncateAtRune unit table
- Add BenchmarkNormalizePath with 14 representative inputs covering
  every branch (clean, mixed-case clean, scanner, TLS handshake, long
  translit slug, long base64 token, byte-cap overflow) so future
  changes can be measured against a stable baseline
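
The mixed-case token rule from the list above might look something like the sketch below. The helper names are hypothetical, and dropping everything after the first token-like segment is an assumption about how the collapse to /:rest behaves.

```go
package main

import (
	"fmt"
	"strings"
)

// restMarker is named in the PR; its value here is assumed.
const restMarker = "/:rest"

// isTokenSegment reports whether seg is at least 32 bytes long, uses only
// the base64url alphabet (A-Z, a-z, 0-9, '-', '_'), and contains at least
// one uppercase letter. The uppercase tracker shares the charset loop.
func isTokenSegment(seg string) bool {
	if len(seg) < 32 {
		return false
	}
	hasUpper := false
	for i := 0; i < len(seg); i++ {
		c := seg[i]
		switch {
		case c >= 'A' && c <= 'Z':
			hasUpper = true
		case c >= 'a' && c <= 'z', c >= '0' && c <= '9', c == '-', c == '_':
			// allowed, nothing to record
		default:
			return false
		}
	}
	return hasUpper
}

// collapseTokens (hypothetical) replaces the first token-like segment and
// everything after it with restMarker.
func collapseTokens(path string) string {
	segs := strings.Split(path, "/")
	for i, seg := range segs {
		if isTokenSegment(seg) {
			return strings.Join(segs[:i], "/") + restMarker
		}
	}
	return path
}

func main() {
	token := "Zm9vYmFyYmF6cXV4MTIzNDU2Nzg5MGFiY2RlZg"
	slug := "kak-vybrat-noutbuk-dlya-raboty-i-ucheby-v-2026-godu"
	fmt.Println(collapseTokens("/auth/magic/" + token)) // prints: /auth/magic/:rest
	fmt.Println(collapseTokens("/blog/" + slug))        // slug is all-lowercase, so unchanged
}
```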

The updater stores every shipped binary in <binDir>/.topsrv-backup/ so
a crash-loop can roll back to a prior version. Five rolling copies were
more than the actual rollback scenarios need: attemptRollback only ever
picks the immediately preceding binary, and on hosts with frequent
updates the older copies just consume disk for no operational gain.

- updateMaxBackups drops from 5 to 2: the live binary is always staged
  for crash-loop rollback, and one preceding copy covers a manual
  revert. Existing TestBackupAndTrim and TestTrimBackupsVersionOrder
  reference the constant directly, so the trim invariant is still
  validated end-to-end without test changes
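
As a rough illustration of the trim invariant, the sketch below keeps the newest updateMaxBackups versions and reports the rest for removal. Only the constant name and its new value come from the description. The version format, the comparison, and the `trimBackups` signature are assumptions, and the real updater works on files in the backup directory rather than on a slice of strings.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// updateMaxBackups matches the new value from this release.
const updateMaxBackups = 2

// parseVersion turns "v0.1.2" into [0 1 2]; non-numeric parts count as 0.
func parseVersion(v string) []int {
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	nums := make([]int, len(parts))
	for i, p := range parts {
		nums[i], _ = strconv.Atoi(p)
	}
	return nums
}

// versionLess reports whether version a is older than version b.
func versionLess(a, b string) bool {
	av, bv := parseVersion(a), parseVersion(b)
	for i := 0; i < len(av) && i < len(bv); i++ {
		if av[i] != bv[i] {
			return av[i] < bv[i]
		}
	}
	return len(av) < len(bv)
}

// trimBackups (hypothetical) returns the newest `limit` versions to keep
// and the older ones to remove, both ordered newest first.
func trimBackups(versions []string, limit int) (keep, remove []string) {
	sorted := append([]string(nil), versions...) // do not mutate the input
	sort.Slice(sorted, func(i, j int) bool { return versionLess(sorted[j], sorted[i]) })
	if len(sorted) <= limit {
		return sorted, nil
	}
	return sorted[:limit], sorted[limit:]
}

func main() {
	backups := []string{"v0.1.0", "v0.0.9", "v0.1.2", "v0.1.1", "v0.0.10"}
	keep, remove := trimBackups(backups, updateMaxBackups)
	fmt.Println(keep, remove)
}
```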
@sergeyfast sergeyfast merged commit fecf14f into master May 19, 2026
1 check passed