Skip to content

fix(redaction): scope long_hex_secret to credential context (#136)#159

Open
tcconnally wants to merge 3 commits into
mainfrom
fix/136-long-hex-secret-redaction
Open

fix(redaction): scope long_hex_secret to credential context (#136)#159
tcconnally wants to merge 3 commits into
mainfrom
fix/136-long-hex-secret-redaction

Conversation

@tcconnally
Copy link
Copy Markdown
Owner

Summary

Closes #136 — first item in the v1.0.6 hotfix bundle.

The pre-1.0.6 default redaction rule \b[a-fA-F0-9]{40,}\b silently destroyed every git commit hash (40 hex chars), SHA-256 sum (64 hex), Docker digest, and Atlassian content hash in any rendered output that crossed the trust boundary. @query "git log --oneline" produced [REDACTED:long_hex_secret] for every commit, with no recovery path.

Fix

The rule now requires an explicit credential anchor before matching:

(?i)(?:secret|token|key|password|passwd|api[_-]?key|auth(?:orization)?)
   \s*[:=]\s*["']?([a-fA-F0-9]{40,})["']?

Real secrets in credential context are still caught; bare hashes pass through verbatim.

Internals

  • New rule-internal field _anchor_group identifies which capture group holds the secret payload.
  • _sub() in redact_text() extended to support three modes:
    1. anchor_group=N — replace only that span, preserve everything else
    2. Legacy: group(1) is a prefix to preserve (used by bearer_header)
    3. No groups — replace whole match
  • Backward-compatible: all existing rules unchanged in behavior.

Tests

6 new regression tests in tests/test_redaction.py:

  • test_bare_git_sha1_is_not_redacted_by_defaults
  • test_bare_sha256_checksum_is_not_redacted_by_defaults
  • test_credential_anchored_hex_IS_redacted (5 cases)
  • test_credential_anchored_hex_preserves_surrounding_context
  • test_bearer_header_prefix_still_preserved (sanity)
  • test_at_query_git_log_output_survives_redaction (integration)

Test results

  • test_redaction.py: 27/27 pass (21 existing + 6 new)
  • test_edge_cases.py: parity with main confirmed (12 pass / 21 fail, identical sorted diff) — 0 net new failures introduced

CHANGELOG

New [1.0.6] — UNRELEASED section added with full migration notes.

Migration

No config breaking changes. If a user had relied on long_hex_secret redacting bare hex output (unlikely), they need to add an explicit user rule under redaction.patterns.


First of 12 PRs in the v1.0.6 milestone. Suggested next: #129 (trust profile override) or #137 (audit log secret leak).

The pre-1.0.6 default rule `\b[a-fA-F0-9]{40,}\b` silently destroyed git
commit hashes (40 hex chars), SHA-256 sums (64 hex chars), Docker digests,
and Atlassian content hashes in any rendered output that crossed the trust
boundary. `@query "git log --oneline"` produced `[REDACTED:long_hex_secret]`
for every commit hash, with no recovery path.

The rule now requires an explicit credential anchor before matching:
  (?i)(?:secret|token|key|password|passwd|api[_-]?key|auth(?:orization)?)
  \s*[:=]\s*["']?([a-fA-F0-9]{40,})["']?

Real secrets in credential context are still caught; bare hashes pass through.

Internals:
- New `_anchor_group` field on rule dicts identifies which capture group
  holds the secret payload. _sub() in redact_text() replaces only that span
  while preserving surrounding context verbatim.
- Legacy prefix-preserve behavior (group(1) is a prefix) is retained for
  bearer_header and other existing rules — controlled by absence of
  _anchor_group.

Tests (tests/test_redaction.py):
- test_bare_git_sha1_is_not_redacted_by_defaults
- test_bare_sha256_checksum_is_not_redacted_by_defaults
- test_credential_anchored_hex_IS_redacted
- test_credential_anchored_hex_preserves_surrounding_context
- test_bearer_header_prefix_still_preserved
- test_at_query_git_log_output_survives_redaction

All 27 redaction tests pass. test_edge_cases.py parity vs main confirmed
(0 net new failures).

Closes #136
Refs milestone v1.0.6
@tcconnally tcconnally added this to the v1.0.6 milestone Jun 3, 2026
@tcconnally tcconnally added bug Something isn't working security P1-high labels Jun 3, 2026
tcconnally added a commit that referenced this pull request Jun 4, 2026
Comprehensive handoff for the v1.0.6 security + correctness hotfix
cycle (2026-06-03):

- Summarizes all 10 open PRs (#159-#164, #170-#173) with closes-issue,
  severity, suggested merge order, and known conflicts.
- Documents the 7 residual issues NOT yet PR'd (#138, #140, #167,
  #130, #135, #141, #142) with size estimates.
- Captures architectural notes: the _provenance pattern, build-
  artifact drift trap, @Perseus header requirement, git stash + built
  artifact hazard.
- Lists 97/97 new regression tests and pre-existing failure baseline
  for parity-checking.
- Provides release mechanics: tagging, smoke testing, security
  advisory recommendations with CVSS estimates.
- Lists mistakes I made this session (5 of them) and open questions
  for the maintainer (5 of them).
- Appendix A: side-by-side diff of my review's findings vs Codex's
  findings — shows Codex caught 5 Criticals my review missed, all of
  which now have PRs.

Lives at HANDOFF-v1.0.6.md (existing HANDOFF.md is the Phase 24
extensibility-spec handoff from 2026-05-24 and is preserved).

Co-authored-by: Thomas Connally <tconnally@atlassian.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working P1-high security

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: long_hex_secret redaction rule corrupts git hashes in rendered output

2 participants