Add detection rules for more secret formats #2603
dgageot wants to merge 2 commits into docker:main
Assessment: 🟡 NEEDS ATTENTION
This PR adds nine new secret-detection rules with corresponding tests. The regexes are well-structured with unique keyword prefixes to keep pre-filtering cheap. One medium-severity concern was found in the changed code — details in the inline comment.
// anthropic-api-key. Claude keys follow
// `sk-ant-(api|sid)NN-<base64url>` and are ~108 chars long;
// the trailing "AA" is the standard base64 padding.
expression: `sk-ant-(api|sid)\d{2}-[A-Za-z0-9_-]{93}AA`,
[MEDIUM] Anthropic key regex hard-codes exact body length — brittle to format variations
The regex `sk-ant-(api|sid)\d{2}-[A-Za-z0-9_-]{93}AA` requires the payload section to be exactly 93 characters followed by a literal `AA` suffix. While the accompanying comment says keys are "~108 chars long" and `AA` is standard base64 padding, using a fixed length means any key that doesn't precisely match this layout will silently escape detection.
Real Anthropic API keys come in at least two observed formats (api03-… and sid01-…), and Anthropic could change the key length in the future without notice. A more resilient pattern would accept a range of body lengths instead of a single fixed count:
// Current (brittle: misses keys of any other length)
expression: `sk-ant-(api|sid)\d{2}-[A-Za-z0-9_-]{93}AA`
// More resilient alternative
expression: `sk-ant-(api|sid)\d{2}-[A-Za-z0-9_-]{90,100}AA`

If the exact 93-char length is a deliberate high-precision choice to minimise false positives, it would help to document that reasoning in the comment so future maintainers don't wonder why it's a fixed count.
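To make the trade-off concrete, here is a small Go sketch that runs both patterns from this comment against synthetic keys of varying body length. The `fakeKey` helper and the filler character are assumptions for illustration only; the generated strings are not real credentials and do not reflect how actual keys are encoded.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Both patterns are quoted from the review comment above.
var (
	fixedLen = regexp.MustCompile(`sk-ant-(api|sid)\d{2}-[A-Za-z0-9_-]{93}AA`)
	ranged   = regexp.MustCompile(`sk-ant-(api|sid)\d{2}-[A-Za-z0-9_-]{90,100}AA`)
)

// fakeKey is a hypothetical helper: it builds a synthetic, non-secret
// string with n filler characters between the prefix and the "AA" suffix.
func fakeKey(n int) string {
	return "sk-ant-api03-" + strings.Repeat("x", n) + "AA"
}

func main() {
	// Compare how each pattern reacts as the body length drifts from 93.
	for _, n := range []int{89, 90, 93, 100, 101} {
		k := fakeKey(n)
		fmt.Printf("body=%d fixed=%v ranged=%v\n",
			n, fixedLen.MatchString(k), ranged.MatchString(k))
	}
}
```

The fixed-length pattern only accepts the 93-character body, while the ranged pattern tolerates some drift at the cost of a slightly wider match surface, which is the precision question raised above.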
Extends pkg/secretsscan with nine new detection rules on top of the upstream Trivy / mcp-gateway catalogue. Each one targets a credential
format with a unique prefix so the keyword pre-filter stays cheap and
the regex's false-positive rate is low.
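The keyword pre-filter idea can be sketched roughly as follows. This is a minimal illustration, not the actual pkg/secretsscan code: the `rule` struct, `containsSecret`, `hasAnyKeyword` and `sampleKey` are hypothetical names, and only the rule ID and regex are taken from this PR.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule is a hypothetical shape for a detection rule; the real types in
// pkg/secretsscan may differ.
type rule struct {
	id       string
	keywords []string
	re       *regexp.Regexp
}

var rules = []rule{
	{
		id:       "anthropic-api-key",
		keywords: []string{"sk-ant-"},
		re:       regexp.MustCompile(`sk-ant-(api|sid)\d{2}-[A-Za-z0-9_-]{93}AA`),
	},
}

// hasAnyKeyword is the cheap substring check that runs before any regex.
func hasAnyKeyword(text string, keywords []string) bool {
	for _, k := range keywords {
		if strings.Contains(text, k) {
			return true
		}
	}
	return false
}

// containsSecret evaluates a rule's regex only when one of its keywords
// is present, so inputs without the prefix never reach the regex engine.
func containsSecret(text string) bool {
	for _, r := range rules {
		if hasAnyKeyword(text, r.keywords) && r.re.MatchString(text) {
			return true
		}
	}
	return false
}

// sampleKey returns a synthetic, non-secret string shaped like the rule expects.
func sampleKey() string {
	return "sk-ant-api03-" + strings.Repeat("x", 93) + "AA"
}

func main() {
	fmt.Println(containsSecret("nothing to see here"))
	fmt.Println(containsSecret("the sk-ant- marker alone"))
	fmt.Println(containsSecret("key=" + sampleKey()))
}
```

A distinctive prefix helps twice here: most inputs are rejected by the substring check alone, and the few that pass are unlikely to match the full expression by accident.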
Keywords for the new rules: T3BlbkFJ marker (sk-… keys), sk-ant-, AIza, GOCSPX-, dop_v1_, whsec_, AKCp, AKID, sntrys_.

TestContainsSecretsRecognisesKnownTokens is extended to cover each new rule and now also asserts that Redact removes the raw value.

The existing idempotency, linear-scaling and "marker is not a secret" guard-rails still pass.
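The properties named above (raw value removed, idempotency, marker is not a secret) could be checked along these lines. This is a sketch under assumptions: the lowercase `redact` function, the `[REDACTED]` placeholder and `syntheticKey` are invented for illustration and are not the package's real `Redact` implementation or test code.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Pattern taken from the PR; everything else in this block is hypothetical.
var anthropicKey = regexp.MustCompile(`sk-ant-(api|sid)\d{2}-[A-Za-z0-9_-]{93}AA`)

// placeholder is an assumed replacement string.
const placeholder = "[REDACTED]"

// redact replaces every match of the rule with the placeholder.
func redact(text string) string {
	return anthropicKey.ReplaceAllString(text, placeholder)
}

// syntheticKey returns a non-secret string shaped like the rule expects.
func syntheticKey() string {
	return "sk-ant-sid01-" + strings.Repeat("y", 93) + "AA"
}

func main() {
	raw := syntheticKey()
	once := redact("token: " + raw)
	twice := redact(once)

	fmt.Println("raw value removed:", !strings.Contains(once, raw))
	fmt.Println("idempotent:", once == twice)
	fmt.Println("marker left alone:", redact("sk-ant-") == "sk-ant-")
}
```

Asserting on the absence of the raw value, and not only on the presence of a placeholder, guards against a redactor that masks just part of the match.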