Release v0.1.2: nginx URI cap + smaller backup retention#16
Merged
Conversation
Production gateway logs were flooded with label_value_too_long for
topsrv_nginx_*_total{uri=...} because normalizePath collapsed by depth
and regex but never bounded the resulting byte length, and the
control-byte check missed nginx's own escape format (TLS handshakes
on port 80 land in $uri as printable `\xHH` text, not raw bytes).
- normalizePath now hard-caps the normalized URI at 240 bytes with a
restMarker suffix, leaving the standard 16-byte headroom before the
Prometheus 256-byte label limit. truncation walks back to a UTF-8
rune boundary via a new truncateAtRune helper so a Cyrillic / CJK
tail is never cut mid-rune and shipped as invalid UTF-8
- Detect the literal `\x` escape sequence in $uri and map to /:invalid.
nginx writes non-printable bytes as printable `\xHH` text before
logging, which made the existing path[i] < 0x20 check a no-op against
TLS ClientHello probes — those were the bulk of the >256-byte URIs
hitting the gateway
- Collapse 32+ char base64url-charset segments containing at least one
uppercase letter to /:rest. Random tokens (session ids, email magic
links) are mixed-case; hyphenated product/article slugs are not, so
the uppercase requirement keeps legitimate long-transliteration URLs
readable while hiding unbounded cardinality
- Pre-cap incoming \$uri at 1KB. nginx already enforces
large_client_header_buffers (default 8KB) but the regex pipeline
shouldn't trust upstream and shouldn't amplify pathological input
- Single byte-scan now covers control bytes, the `\x` escape, and an
uppercase tracker. The hasUpper flag piggy-backs the same loop and
also skips strings.ToLower on the scanner-suffix check, eliminating
one allocation per line on the dominant all-lowercase path
- Extract restMarker and invalidMarker sentinel consts so call sites
and the s[:N-len(restMarker)] truncation arithmetic stay in sync,
and update truncatePath to use the same const
- Cover the new branches in TestNormalizePath (\`\x\` escape, long
mixed-case token, lowercase-slug exemption, oversize raw input) plus
dedicated TestNormalizePath_ByteCap that verifies UTF-8 validity
after the 240-byte cap on a 300-rune Cyrillic input, and a
TestTruncateAtRune unit table
- Add BenchmarkNormalizePath with 14 representative inputs covering
every branch (clean, mixed-case clean, scanner, TLS handshake, long
translit slug, long base64 token, byte-cap overflow) so future
changes can be measured against a stable baseline
The updater stores every shipped binary in <binDir>/.topsrv-backup/ so a crash-loop can roll back to a prior version. Five rolling copies were oversized for the actual rollback scenarios — only the immediately preceding binary is ever picked by attemptRollback, and on hosts with frequent updates the older copies just consume disk for no operational gain. - updateMaxBackups drops from 5 to 2: the live binary is always staged for crash-loop rollback, and one preceding copy covers a manual revert. Existing TestBackupAndTrim and TestTrimBackupsVersionOrder reference the constant directly, so the trim invariant is still validated end-to-end without test changes
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
v0.1.2
Fixes
label_value_too_longfrom the downstream Prometheus gateway.\xHHprintable text from TLS handshakeson port 80) and route to
/:invalidinstead of inflating per-URImetric cardinality.
/:rest(session ids, magic links). All-lowercase slugs stay unchanged.
Operability
Test plan
make fmt lint— 0 issuesgo test ./...— all packages greenBenchmarkNormalizePath— no regression on all-lowercase paths; TLS handshake −99.9%TestNormalizePath_ByteCap— UTF-8 validity preserved on 300-rune Cyrillic input