Release 0.2.7: patch rstr {N>100} bug, fixes cloudflare_origin_ca_key #15

Merged: slima4 merged 2 commits into main from release/0.2.7 on Apr 21, 2026

@slima4 slima4 commented Apr 21, 2026

Summary

Fixes the last of the 6 community.json rules the other team flagged: cloudflare_origin_ca_key now generates 30/30 real matches instead of being skipped. Root cause was an rstr bug that misapplied an unbounded-quantifier cap to fixed-count {N} repetitions.

Root cause

rstr 3.2.x's Xeger._handle_repeat:

end_range = min((end_range, STAR_PLUS_LIMIT))  # STAR_PLUS_LIMIT = 100
times = self._random.randint(start_range, end_range)

This cap is meant to bound the unbounded * and + quantifiers. For a fixed {146}, however, sre_parse passes start=146, end=146; rstr computes min(146, 100) = 100, and randint(146, 100) then raises ValueError (empty range). Real-world trigger: the gitleaks rule cloudflare_origin_ca_key, which matches v1.0-<24 hex>-<146 hex>.
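The failure can be reproduced without rstr installed. The sketch below only mirrors the two quoted lines; `buggy_repeat_count` is a hypothetical helper name, not part of rstr, and the rest of `_handle_repeat` is left out.

```python
import random

STAR_PLUS_LIMIT = 100  # cap value quoted in this PR


def buggy_repeat_count(start_range: int, end_range: int, rng: random.Random) -> int:
    """Mirror the two quoted lines: cap unconditionally, then draw a count."""
    end_range = min((end_range, STAR_PLUS_LIMIT))
    return rng.randint(start_range, end_range)


rng = random.Random(0)

# A bounded range below the cap is unaffected.
small = buggy_repeat_count(2, 5, rng)

# A fixed count above the cap, as in {146}, becomes randint(146, 100).
try:
    buggy_repeat_count(146, 146, rng)
except ValueError as exc:
    failure = exc
else:
    failure = None

print(small, type(failure).__name__)
```

Any fixed count above 100 fails the same way, which matches the {N>100} wording in the PR title.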

Fix

Monkeypatch rstr.xeger.Xeger._handle_repeat when crossfire.generator is imported so that the cap never pushes end_range below start_range. This preserves the STAR_PLUS_LIMIT intent for unbounded quantifiers and wide variable ranges, while fixed-count repetitions above the cap produce their exact count.

if end_range > STAR_PLUS_LIMIT:
    end_range = max(STAR_PLUS_LIMIT, start_range)
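To see what the two-line clamp does across quantifier shapes, here is a minimal standalone sketch. `clamp_end_range` and the `UNBOUNDED` sentinel are hypothetical names for illustration; the real patch lives inside `_handle_repeat`, and the value the regex parser uses for "no upper bound" is assumed here to be simply some number far above the cap.

```python
STAR_PLUS_LIMIT = 100  # cap value quoted in this PR
UNBOUNDED = 10**9      # stand-in for the parser's "no upper bound" value (assumption)


def clamp_end_range(start_range: int, end_range: int, limit: int = STAR_PLUS_LIMIT) -> int:
    """Apply the cap without letting end_range drop below start_range."""
    if end_range > limit:
        end_range = max(limit, start_range)
    return end_range


# (start, end) pairs as a regex parser might report them for each quantifier.
cases = {
    "a{3}": (3, 3),
    "a{0,50}": (0, 50),
    "a*": (0, UNBOUNDED),
    "a{10,500}": (10, 500),
    "a{146}": (146, 146),
    "a{120,200}": (120, 200),
}

for pattern, (start, end) in cases.items():
    print(f"{pattern:12} start={start:<4} end={clamp_end_range(start, end)}")
```

One consequence of the clamp as written: a variable range whose lower bound is already above the cap, such as {120,200}, collapses to its lower bound rather than keeping its full width.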

Results on community.json (all 90 rules, 30 samples each)

| Rule | 0.2.6 | 0.2.7 |
| --- | --- | --- |
| ssh_private_key | 30 real | 30 real |
| private_key_pem | 30 real | 30 real |
| inst_tag | 30 real | 30 real |
| atlassian_api_token | 30 real | 30 real |
| cloudflare_origin_ca_key | 0 (skipped) | 30 real |
| kubernetes_secret_yaml | 27-30 degenerate | 30 still degenerate (separate rstr wildcard issue) |

Known limitation (disclosed)

kubernetes_secret_yaml now generates 30 samples, but the middle span generated by (?s:.){0,100}? is random control-character garbage identical across samples. Fixing needs either rule-aware sampling or a structured-input generator; out of scope here and acknowledged as nice-to-have by the other team.
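A rough illustration of why the wildcard span is hard to sample well: the quoted fragment accepts control characters just as readily as YAML-like text, so the regex alone gives a generator no reason to prefer realistic filler. This uses only the fragment quoted above, not the full kubernetes_secret_yaml rule, and both candidate strings are made up for the example.

```python
import re

# Fragment quoted in this PR; the full rule is not reproduced here.
MIDDLE_SPAN = re.compile(r"(?s:.){0,100}?")

control_noise = "\x01\x02\x03\n\x1f" * 4      # 20 unprintable characters
yaml_like = "\nmetadata:\n  name: example\n"  # hypothetical realistic filler

for candidate in (control_noise, yaml_like):
    matched = MIDDLE_SPAN.fullmatch(candidate) is not None
    printable = sum(ch.isprintable() for ch in candidate) / len(candidate)
    print(f"matched={matched} printable_share={printable:.2f}")
```

Both candidates satisfy the fragment, which is why a fix would have to come from the sampling side (rule-aware or structured input) rather than from the pattern.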

Test plan

  • pytest — 268 passed (adds 4 regression tests in TestRstrRepeatPatch)
  • ruff check, ruff format --check, mypy --strict crossfire/ clean
  • End-to-end verified: community.json now produces samples for all 90 rules (was 89/90 on 0.2.6, 88/90 on 0.2.5)

Breaking changes

None. The rstr monkeypatch affects only the rstr.xeger.Xeger class in this process after importing crossfire.generator; callers that import crossfire.generator get the fix automatically. Patch stays until a fixed rstr release is out.

Release note

After merge: tag v0.2.7 and push it to trigger the PyPI release.

…_ca_key

rstr 3.2.x caps end_range at STAR_PLUS_LIMIT=100 unconditionally in
_handle_repeat. For fixed counts like {146} this pushes end below start
and randint raises. Monkeypatch Xeger._handle_repeat at module import so
the cap only applies when it leaves end>=start. Cloudflare-style rules
with {146} now generate 30/30 real matches.

kubernetes_secret_yaml still has low-diversity samples (rstr fills
(?s:.){0,100}? with control-char garbage); disclosed as known limitation.

codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@slima4 slima4 merged commit 2b7247d into main Apr 21, 2026
3 checks passed
@slima4 slima4 deleted the release/0.2.7 branch April 21, 2026 20:35