[REVIEW] sast-config: CWE-matrix false gaps, no CodeQL build-completeness check, baseline-suppression blind spot

## Skill Being Reviewed
**Skill name:** `sast-config`
**Skill path:** `skills/devsecops/sast-config/`

## False Positive Analysis

**Benign code that triggers a false positive:**
```yaml
# .github/codeql/codeql-config.yml — a CodeQL run that REPORTS zero alerts
# while actually analyzing almost nothing, because autobuild silently
# half-compiled a Java module. The skill's CI review would mark this "passing".
queries:
  - uses: security-extended
# (no failure surfaced; database built from 12% of sources)
```
And a CWE-coverage example:
```
CWE-352 (CSRF): 0 active SAST rules  -> skill flags this as a High/Medium GAP
```

**Why this is a false positive:**
1. **The CWE Top 25 coverage matrix produces false "gap" findings for weakly-SAST-detectable CWEs.** The matrix lists CSRF (CWE-352) and Unrestricted Upload (CWE-434) as detectable and ties "CWE Top 10 with zero coverage = High." But CSRF is fundamentally a framework/runtime property that SAST detects poorly and noisily; out-of-bounds write/read and use-after-free (787/125/416) are only meaningfully reachable in C/C++ with heavy analysis. Penalizing a Python/JS team for "0 rules on CWE-352/787" reports gaps for weaknesses SAST cannot reliably cover — a false finding driven by the matrix, not the codebase.
2. **`Semgrep ERROR → Critical/High → block merge` is a uniform mapping that over-blocks.** Severity in Semgrep rules is author-assigned; community/registry rules routinely ship `ERROR` for medium-real-risk patterns. Section 5.1 and Pitfall #3 hardwire ERROR to "block CI," which mass-blocks on noisy rulesets — directly contradicting the skill's own false-positive-reduction goal.
3. **Blanket test-path exclusion treated as always safe.** 3.1 says excluding `test/`, `*.test.js`, `*.spec.py` is "acceptable." But security-relevant logic and *vendored* third-party code often live under excluded paths, and `vendor/`/`node_modules` exclusion (correct for first-party SAST) silently drops copied third-party source that is a real attack surface. Presenting these exclusions as unconditionally safe under-reports.

## Coverage Gaps

**Missed variant 1:**
```yaml
# CodeQL matrix for a compiled language with no build-success verification
strategy:
  matrix:
    language: [java, javascript]
steps:
  - uses: github/codeql-action/autobuild@v3   # may silently fail for java
  - uses: github/codeql-action/analyze@v3     # reports "0 alerts" on a near-empty DB
```
**Why it should be caught:**
For compiled languages (Java, C#, C/C++, Kotlin), CodeQL only analyzes code it successfully **builds**. If `autobuild` partially fails, the database covers a fraction of the code and `analyze` cheerfully reports few/zero alerts — a high-severity **false negative that looks like success**. The Step 6 CI review checks that scans *run* and are *required*, but never verifies build success, database completeness, or that every matrix language actually produced a populated database. This is the single most common way CodeQL "passes" while covering nothing.

**Missed variant 2:**
```bash
# semgrep ci with diff-aware/baseline suppression hides pre-existing, unfixed findings forever
semgrep ci --baseline-commit "$(git merge-base origin/main HEAD)"
# A SQLi present before the baseline is never surfaced in any PR.
```
**Why it should be caught:**
The skill recommends `semgrep ci` and PR-scoped scanning but never addresses **baseline/diff-aware suppression of known-but-unfixed findings**. Pitfall #1 notes incremental scans miss cross-file flows, but the larger issue is that baseline mode permanently hides pre-existing debt: a vulnerability older than the baseline never appears in PR checks and, without a tracked full-scan backlog, is invisible. The review needs a check that scheduled full scans exist *and* that their findings are tracked to closure, not just that a weekly cron is present.

## Edge Cases

- **Unscoped `# nosemgrep` is a silent global suppression.** A bare `# nosemgrep` (no rule id) disables *all* rules on that line, not just the intended one. The skill's example correctly scopes the suppression, but it never warns that unscoped suppressions are a coverage hole — exactly what a SAST-config review should flag. Same applies to overly broad `query-filters` `exclude` by tag.
- **FP-rate target (<20%) is unmeasurable and perverse.** Without ground-truth labeling there is no true denominator; setting a numeric FP target incentivizes disabling noisy-but-valid rules to "hit the number," undermining coverage. Should be framed as a triaged-trend, not a hard threshold.
- **`security-and-quality` vs the <10-min PR budget.** 4.1 recommends `security-and-quality` for "maximum coverage," but that suite adds maintainability/quality queries that inflate noise and runtime — colliding with the skill's own "under 10 minutes for PR checks" guidance. The trade-off (full suite on schedule, leaner suite on PR) is unstated.

## Remediation Quality
- [x] Fix resolves the vulnerability
- [ ] Fix doesn't introduce new security issues
- [x] Fix doesn't break functionality
- **Issues found:** The CI patterns and custom-rule checklists are strong, but two of the *example custom rules* themselves illustrate Pitfall #4 (low coverage):
  - `custom.auth.jwt-none-algorithm` matches `algorithm="none"` literally — it misses `None`/`NONE` case variants and the algorithm arriving via a variable, so it under-detects the exact bug it targets.
  - `custom.auth.hardcoded-admin-bypass` matches only `if $USER == "admin": return True` — it won't catch role-field comparisons, `in` checks, or `!=` inversions, making it a narrow rule presented as a model.
  Recommended changes: (1) mark weakly-SAST-detectable CWEs (352/434/787/416/125) as "low SAST confidence" in the matrix and don't drive High/Medium gap severity from them; (2) add a CodeQL **build-success / database-completeness** verification to Step 6; (3) add **baseline-suppression backlog tracking** and an **unscoped-`nosemgrep`** check; (4) condition severity on rule confidence rather than mapping all ERROR → block.

## Comparison to Other Tools

| Tool | Catches this? | Notes |
|------|:---:|-------|
| Semgrep (managed/Pro) | Partial | Pro adds cross-file taint and confidence metadata that could drive the severity-by-confidence fix this review recommends. |
| CodeQL | Partial | Powerful taint engine, but the build-completeness false-negative is a CodeQL-specific failure mode the skill omits. |
| SonarQube | Partial | Good FP-management UX and quality gates; mixes quality+security, echoing the noise trade-off noted here. |
| Snyk Code (DeepCode) | Yes | Interfile analysis without a build step — avoids the CodeQL autobuild false-negative for compiled languages. |
| OpenSSF Scorecard | Partial | Checks CI hardening (pinned actions, required checks) complementary to this skill's Step 6. |

## Overall Assessment

**Strengths:**
- Genuinely useful CWE Top 25 ↔ rule mapping concept, concrete Semgrep/CodeQL authoring checklists, and solid CI integration patterns with branch-protection and scheduling guidance.
- Pitfalls are on-point (disable-vs-fix, severity flattening, test custom rules against safe+vulnerable corpora).
- Clear ASVS mapping and a clean assessment output template.

**Needs improvement:**
- Coverage matrix drives false "gap" findings for CWEs SAST detects poorly (CSRF/OOB/UAF).
- No CodeQL build-success/database-completeness check → silent false negatives that look like passing scans.
- No handling of baseline/diff-aware suppression hiding pre-existing findings, or of unscoped `nosemgrep`.
- Uniform `ERROR → block` and a hard <20% FP target work against the skill's own FP-reduction aim; two example custom rules are themselves low-coverage.

**Priority recommendations:**
1. Add CodeQL build-success/database-completeness verification to the CI review (highest-impact false-negative).
2. Re-tier the CWE matrix by SAST confidence so weakly-detectable CWEs don't generate gap findings, and map severity by rule confidence rather than raw ERROR.
3. Add baseline-suppression backlog tracking and an unscoped-`nosemgrep` coverage-hole check.

## Bounty Info
- [x] I have read and agree to the [CONTRIBUTING.md](../../CONTRIBUTING.md) bounty terms
- **Preferred payment method:** Crypto, preferably Base USDC to `0x1b58008Fde85832845817F9B24E64d139E418EeF`


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[REVIEW] sast-config: CWE-matrix false gaps, no CodeQL build-completeness check, baseline-suppression blind spot #1145

Skill Being Reviewed

False Positive Analysis

Coverage Gaps

Edge Cases

Remediation Quality

Comparison to Other Tools

Overall Assessment

Bounty Info

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Tool	Catches this?	Notes
Semgrep (managed/Pro)	Partial	Pro adds cross-file taint and confidence metadata that could drive the severity-by-confidence fix this review recommends.
CodeQL	Partial	Powerful taint engine, but the build-completeness false-negative is a CodeQL-specific failure mode the skill omits.
SonarQube	Partial	Good FP-management UX and quality gates; mixes quality+security, echoing the noise trade-off noted here.
Snyk Code (DeepCode)	Yes	Interfile analysis without a build step — avoids the CodeQL autobuild false-negative for compiled languages.
OpenSSF Scorecard	Partial	Checks CI hardening (pinned actions, required checks) complementary to this skill's Step 6.

[REVIEW] sast-config: CWE-matrix false gaps, no CodeQL build-completeness check, baseline-suppression blind spot #1145

Description

Skill Being Reviewed

False Positive Analysis

Coverage Gaps

Edge Cases

Remediation Quality

Comparison to Other Tools

Overall Assessment

Bounty Info

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions