fix: strengthen Research Pack validator — add Stop condition check, strip code blocks, detect empty sections#79

Merged
ShadyUnderLight merged 4 commits into main from fix/strengthen-research-pack-validator on May 11, 2026

ShadyUnderLight commented on May 11, 2026

Summary

Strengthen scripts/validate_research_pack.py to catch structural issues that the previous simple string-contains check missed, reducing false confidence in Research Pack validation.

Changes

P1: Proper heading detection

  • Before: substring match (h in cleaned), which falsely accepted ### Stop condition, > ## Stop condition, and ## Stop condition details
  • After: exact H2 parsing via the ^## title$ regex, so only real H2 headings with the exact title text count
  • Empty-section bounds use any heading as the boundary (not just required headings), so a body that contains only sub-headings is correctly detected as empty
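
A minimal sketch of the exact-H2 check described above, for illustration only. The function name `has_h2` is hypothetical and not taken from scripts/validate_research_pack.py; the sketch assumes the title must match the whole line with no trailing whitespace, as the ^## title$ pattern suggests.

```python
import re


def has_h2(text: str, title: str) -> bool:
    """Return True only if some line of `text` is exactly '## <title>'."""
    # re.MULTILINE makes ^ and $ anchor at line boundaries, so H3 headings,
    # blockquoted headings, leading spaces, and longer titles all fail.
    pattern = re.compile(r"^## " + re.escape(title) + r"$", re.MULTILINE)
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in (
        "## Stop condition",
        "### Stop condition",
        "> ## Stop condition",
        "## Stop condition details",
    ):
        print(repr(sample), has_h2(sample, "Stop condition"))
```

Only the first sample is accepted; the other three are the cases the old substring check let through.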

P2: Robust fenced code block stripping

  • Before: a single regex that did not handle indented fences (up to 3 spaces) and let inline backticks terminate a block early
  • After: a line-by-line state machine. The opening fence matches ^[ ]{0,3}(```+|~~~+); the closing fence must use the same character, be at least as long as the opening, and be followed only by trailing whitespace
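
The state machine above could look roughly like the following. This is a sketch under assumptions, not the code from the PR: the name `strip_fenced_blocks` is hypothetical, and an unclosed fence is assumed to run to the end of the text.

```python
import re

# Opening fence: up to 3 spaces of indent, then 3+ backticks or 3+ tildes.
FENCE_OPEN = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def strip_fenced_blocks(text: str) -> str:
    """Drop fenced code blocks (fence lines included) and keep everything else."""
    kept = []
    fence_char = None   # None means we are outside any fence
    fence_close = None  # compiled pattern for the matching closing fence
    for line in text.split("\n"):
        if fence_char is None:
            opened = FENCE_OPEN.match(line)
            if opened:
                fence_char = opened.group(1)[0]
                min_len = len(opened.group(1))
                # Closing fence: same char, at least as long, trailing whitespace only.
                fence_close = re.compile(
                    r"^ {0,3}" + re.escape(fence_char) + "{%d,}[ \t]*$" % min_len
                )
                continue
            kept.append(line)
        elif fence_close.match(line):
            fence_char = None
    return "\n".join(kept)
```

Because the closing pattern is built from the opening fence, a shorter run of backticks or a line with text after the backticks stays inside the block instead of ending it early.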

Added

  • ## Stop condition to REQUIRED_HEADINGS (was missing even though the schema requires it)
  • scripts/test_validator_regression.py — dedicated 8-test regression suite covering all edge cases
  • schemas/research-pack.md minimal example now includes ## Stop condition

Fixed

  • CI YAML: replaced the inline python3 -c call (which broke YAML block scalar parsing when the Python code was unindented) with a dedicated test script

Out of scope (by design)

  • Source/citation cross-validation (format varies; belongs in checklists/source-traceability.md)
  • Audit status structure checks (too free-form for automated validation)

Closes #66

LMZ added 4 commits May 11, 2026 14:26
…trip code blocks, detect empty sections (#66)

- Added "## Stop condition" to REQUIRED_HEADINGS (was missing despite schema requiring it)
- Strip fenced code blocks (```, ~~~) before heading detection, so headings inside code blocks no longer falsely satisfy the check
- Detect empty required sections (heading with no non-whitespace content below it) and reject
- Added CI regression tests for all bad-sample scenarios
…ection, fence parser, empty section check (#66)

P1: Heading detection now uses exact H2 parsing (^## title$) instead of
substring containment. Rejects H3, blockquote, leading space, partial title
matches. Empty section bounds use any heading as boundary and exclude
sub-heading lines from body content.
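
One way to read the empty-section rule is sketched below. The helper name `is_section_empty` is hypothetical, and the sketch assumes the body of a section ends at the next heading line of any level, so sub-heading lines never count as content.

```python
import re

# Any ATX heading (levels 1 to 6) acts as a section boundary.
ANY_HEADING = re.compile(r"^#{1,6}\s")


def is_section_empty(text: str, title: str) -> bool:
    """True if the H2 section `title` has no non-whitespace body before the next heading."""
    lines = text.split("\n")
    target = "## " + title
    if target not in lines:
        raise ValueError("heading not found: " + target)
    body = []
    for line in lines[lines.index(target) + 1:]:
        if ANY_HEADING.match(line):
            break  # next heading of any level closes the section
        body.append(line)
    return not any(line.strip() for line in body)
```

Under this reading, a section whose only content is a sub-heading counts as empty, which matches the behavior the commit message describes.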

P2: Fenced code block stripping uses line-by-line state machine.
Handles up-to-3-space indent, proper closing fence (same char, >= length,
trailing whitespace only). Inline backticks no longer terminate early.

CI regression tests use isolated single-mutation samples (valid baseline
+ one mutation per test) instead of ad-hoc partial packs.
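
The single-mutation idea can be illustrated as follows. Everything here is a stand-in: `BASELINE`, `REQUIRED`, `toy_validate`, and the mutation names are hypothetical and are not taken from scripts/test_validator_regression.py; only the "Stop condition" heading comes from the PR.

```python
import re

# Hypothetical valid baseline; each test changes exactly one thing in it.
BASELINE = "## Question\nWhat changed?\n\n## Stop condition\nStop after three sources.\n"
REQUIRED = ["Question", "Stop condition"]  # "Question" is an invented placeholder


def toy_validate(text: str) -> bool:
    """Stand-in validator: every required title must appear as an exact H2 line."""
    return all(
        re.search(r"^## " + re.escape(title) + r"$", text, re.MULTILINE)
        for title in REQUIRED
    )


# One mutation per test, each applied to the same valid baseline.
MUTATIONS = {
    "h3_instead_of_h2": lambda s: s.replace("## Stop condition", "### Stop condition"),
    "blockquoted": lambda s: s.replace("## Stop condition", "> ## Stop condition"),
    "leading_space": lambda s: s.replace("## Stop condition", " ## Stop condition"),
    "partial_title": lambda s: s.replace("## Stop condition", "## Stop condition details"),
}


def run_regression() -> dict:
    """Map each mutation name to True if the validator rejected the mutated pack."""
    assert toy_validate(BASELINE), "baseline must be valid"
    return {name: not toy_validate(mutate(BASELINE)) for name, mutate in MUTATIONS.items()}
```

Because every sample differs from the baseline by one change, a failing test points at a single rule instead of an ad-hoc partial pack that might be invalid for several reasons at once.
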
…op condition to schema example (#66)

- CI YAML: replaced broken inline multi-line python3 -c with
  scripts/test_validator_regression.py (the inline code broke YAML
  block scalar parsing because Python code started at column 0)
- schemas/research-pack.md: added missing ## Stop condition to minimal
  example shape so it matches the schema's required sections
@ShadyUnderLight ShadyUnderLight merged commit 1baf8ef into main May 11, 2026
2 checks passed

Development

Successfully merging this pull request may close these issues.

Strengthen the Research Pack validator to avoid a false sense of security from structural validation