fix(conformance): enforce SC-12 on verbose claims + validate dense date #21

Merged
tymofiy merged 1 commit into main from fix/sc12-verbose-claims-dense-date on Jun 1, 2026


@tymofiy (Owner) commented May 30, 2026

Lands the SC-12 conformance fix from the 2026-05-29 ecosystem audit-sweep, split onto a branch because main is protected (direct push declined during pre-deploy reconcile).

Change (ac6b24d): enforce SC-12 on verbose claims + validate dense date — part of the audit-sweep batch, build-verified.

Every other repo's audit-sweep work is already on main; this is the one commit that needs the PR flow.

🤖 Generated with Claude Code

Audit sweep 2026-05-29 (batch 2). Findings V7.

- P2: the verbose-form branch never enforced SC-12 (a prediction-nature claim
  must keep confidence ≤0.95) — only the dense branch did. Added the mirrored
  check (parse `nature:`/`confidence:` from the verbose metadata) + a new
  invalid fixture `verbose-prediction-too-confident.kpack` so the constraint is
  exercised on both metadata forms.
- P3 (date): the dense branch skipped validating position 4 (date). Now
  validates the ISODate format (YYYY-MM-DD) when the slot is non-empty, per
  kp-claims.peg.
- P3 (depth) NOT enforced — deliberately. The spec lists depth as
  assumed/investigated/exhaustive, but a large share of the shipped corpus
  (system + grounding packs) uses extra values like `practitioner`/`confirmed`.
  Enforcing the closed set flagged hundreds of production claims (and even the
  published "gold" packs), so it is a spec-vs-corpus reconciliation, not a
  parser quick-win. Left as a documented no-op with a dedicated comment.
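
The mirrored verbose-form check described in the P2 item can be sketched roughly as follows. This is a minimal illustration, not the code in `conformance/run.py`: the `key: value` line layout, the helper names `parse_verbose_fields` and `check_sc12_verbose`, and the error wording are all assumptions. Only the 0.95 cap and the `nature:`/`confidence:` field names come from the commit message.

```python
import re

SC12_MAX_CONFIDENCE = 0.95  # cap for prediction-nature claims, per the commit message

# Assumed layout: one "key: value" pair per line, optionally bulleted.
_FIELD_RE = re.compile(r"^\s*(?:[-*]\s*)?(\w+)\s*:\s*(.+?)\s*$", re.MULTILINE)


def parse_verbose_fields(meta_text):
    """Collect named fields from a verbose metadata block into a dict."""
    return {key.lower(): value for key, value in _FIELD_RE.findall(meta_text)}


def check_sc12_verbose(meta_text, cid):
    """Return a list of error strings; empty when SC-12 is satisfied."""
    fields = parse_verbose_fields(meta_text)
    if fields.get("nature", "").lower() != "prediction":
        return []  # SC-12 only constrains prediction-nature claims
    raw = fields.get("confidence")
    if raw is None:
        return []  # a missing confidence is left to other checks in this sketch
    try:
        confidence = float(raw)
    except ValueError:
        return [f"unparseable confidence '{raw}' for {cid}"]
    if confidence > SC12_MAX_CONFIDENCE:
        return [
            f"SC-12: prediction {cid} has confidence {confidence} "
            f"> {SC12_MAX_CONFIDENCE}"
        ]
    return []
```

A claim at exactly 0.95 passes, because the constraint is stated as ≤0.95; only strictly higher values are reported.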

Verify: python3 conformance/run.py → 20/20 (all valid PASS, all invalid FAIL
with expected codes incl. the new verbose SC-12 case).

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 30, 2026 04:11

Copilot AI left a comment

Pull request overview

Updates the KP:1 conformance runner to close a validation gap in SC-12 enforcement for verbose claim metadata, and adds a new invalid fixture to exercise that path.

Changes:

  • Add an invalid fixture (verbose-prediction-too-confident.kpack) expected to fail SC-12.
  • Enforce SC-12 for verbose (named-field) claim metadata by rejecting `nature: prediction` claims whose confidence exceeds 0.95.
  • Add dense-claim date slot format validation in the runner.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| conformance/run.py | Adds SC-12 enforcement for verbose metadata and adds dense date-slot validation logic. |
| conformance/fixtures/invalid/verbose-prediction-too-confident.kpack/PACK.yaml | New invalid fixture pack metadata for the verbose SC-12 test case. |
| conformance/fixtures/invalid/verbose-prediction-too-confident.kpack/evidence.md | Evidence file for the new invalid verbose SC-12 fixture. |
| conformance/fixtures/invalid/verbose-prediction-too-confident.kpack/claims.md | Defines verbose claims where one exceeds the SC-12 prediction confidence cap. |

Comment thread conformance/run.py
Comment on lines +315 to +319
    # Position 4 (date): ISODate. Grammar is YYYY-MM-DD; validate the
    # format when the slot is non-empty (empty interior slots are valid).
    date_s = parts[3].strip()
    if date_s and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date_s):
        errs.append(Err("parse", f"invalid date '{date_s}' for {cid} (expected YYYY-MM-DD)"))
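
The regex in the snippet above checks only the shape of the slot, so a string such as `2026-02-30` would pass. If calendar validity were also wanted, one possible follow-up is sketched below. It is an assumption, not part of this PR: `check_dense_date` is a hypothetical helper that returns plain strings instead of the runner's `Err` objects, and whether kp-claims.peg intends more than a format check is not stated here.

```python
import re
from datetime import date

# Same shape check as the snippet under review.
_ISO_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_dense_date(date_s, cid):
    """Return a list of error strings for the dense date slot (position 4)."""
    date_s = date_s.strip()
    if not date_s:
        return []  # empty interior slots are valid
    if not _ISO_DATE_RE.fullmatch(date_s):
        return [f"invalid date '{date_s}' for {cid} (expected YYYY-MM-DD)"]
    try:
        # Rejects well-shaped but impossible dates such as 2026-02-30.
        date.fromisoformat(date_s)
    except ValueError:
        return [f"impossible calendar date '{date_s}' for {cid}"]
    return []
```

Keeping the shape check ahead of `date.fromisoformat` preserves the existing error message for malformed input and adds a second message only for dates that parse structurally but do not exist.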
@tymofiy tymofiy merged commit ac6b24d into main Jun 1, 2026
2 checks passed