feat(verify): CoT gate, tax/regulatory anchors, and citation data overhaul #19
…table rows Tax and regulatory documents have a distinct anchoring pattern where dollar amounts, percentages, and named legal tests are the correct anchors — not the full qualifying clause. Adds worked examples, BAD/GOOD patterns, three failure table rows, and word-count gate examples for this document class. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Index and table-of-contents pages contain page-number references, not operative evidence. Citing them produces garbage sourceContext and fails verification. Adds a rule with BAD/GOOD examples to steer citations toward body text pages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… symmetry Addresses two review suggestions: 1. Explain the trailing-dash artifact on "$450-" so the anchor stripping is intentional, not accidental. 2. Add prose-placement examples to all three tax/regulatory failure table rows (previously only the "30%" row showed how to rewrite the surrounding sentence). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move citation anchor rules, display label spec, and QA grading rubric from private mono-repo into the skills repo so all SKILL.md references resolve for standalone skills-repo users. Update verify/SKILL.md to reference docs/deep-citation-standards.md (was docs/agents/deep-citation-standards.md in private mono-repo). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The canonical citation format spec now lives in DeepCitation/deepcitation at docs/agents/deep-citation-standards.md. Update SKILL.md references from docs/deep-citation-standards.md to packages/deepcitation/docs/agents/deep-citation-standards.md (the submodule path when used from the mono-repo). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ts.md Also upgrade path to full submodule path for mono-repo resolution. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…urceMatch Forces locate-then-extract reasoning: write `f` (sourceContext) first, then derive `k` (sourceMatch) as a substring. Rewrites citation field docs as a CoT-ordered table; collapses Format 1/2 into a single bold-anchor format; adds f→k hard substring rule to eliminate paraphrase at the source. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
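The `f`→`k` hard substring rule described in this commit comes down to a plain containment test. A minimal Python sketch, assuming `f` and `k` are ordinary strings; the function name `k_is_substring_of_f` and the sample sentence are hypothetical and not taken from the skill or its CLI:

```python
def k_is_substring_of_f(f: str, k: str) -> bool:
    """Return True when the anchor k appears word-for-word inside the context f."""
    # An empty or whitespace-only anchor proves nothing, so treat it as a failure.
    if not k.strip():
        return False
    # Python's `in` operator performs an exact, case-sensitive substring test.
    return k in f


# Hypothetical source context, used only to illustrate the check.
f = "The credit phases out at 30% of income above the threshold."
verbatim_ok = k_is_substring_of_f(f, "30% of income")    # copied from f
paraphrase_ok = k_is_substring_of_f(f, "thirty percent")  # reworded, not in f
print(verbatim_ok, paraphrase_ok)
```

Writing `f` first and then copying `k` out of it makes this check pass by construction, which is the point of the ordering.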
Overall this is a well-reasoned set of improvements. The CoT gate is a genuinely clever fix for paraphrase failures, and the Domain A/B framing + Format 1/Format 2 split adds expressiveness without much cognitive overhead. A few things worth flagging:
The old example used
CDN dependencies with unpinned versions: the file loads
Should this be committed at all?
Minor: "find the sentence in the evidence that proves the claim and hold it in mind as
What is working well
Summary: Resolve the
- Fix (SELF-CHECK step 0): rename `sourceContext` → `f` (`source_context`) to match the actual data field name; avoids collision with the UI component of the same name
- Fix: add `scratch/` to .gitignore and untrack verify-flow.html; scratch files are ephemeral and don't belong in version control

Note: `p` field format (`page_number_N_index_I`) is unchanged from the previous SKILL.md version; the diff only moved the docs from a bullet list to a table. No CLI compatibility concern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
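For reference, the `p` format mentioned in the note (`page_number_N_index_I`) can be parsed with a short regular expression. This is only a sketch: it assumes `N` and `I` are non-negative decimal integers, which the note does not state, and `parse_p` is a hypothetical helper rather than a function from the CLI.

```python
import re
from typing import Optional, Tuple

# Assumed shape: page_number_<digits>_index_<digits>, with nothing before or after.
_P_PATTERN = re.compile(r"page_number_(\d+)_index_(\d+)")


def parse_p(p: str) -> Optional[Tuple[int, int]]:
    """Split a p value into (page_number, index), or return None if it does not match."""
    match = _P_PATTERN.fullmatch(p)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


parsed = parse_p("page_number_3_index_0")
print(parsed)
```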
Thanks for the detailed review. Addressing each point:
@claude review the latest changes
Summary
- … `f`) before picking `sourceMatch` (`k`): eliminates paraphrase failures by making `f`→`k` a visible substring check, not a post-hoc rationalization
- … `n`, `r`, `f`, `k`, `p`, `l`): field order now communicates the required reasoning order
- … `f`→`k` substring rule: added as a standalone callout in both the data-block section and the sub-agent instructions; if the planned `k` doesn't appear word-for-word in `f`, the model must fix `f` first
- … `deep-citation-standards.md` reference to the deepcitation repo; updated concepts doc reference

Test plan
- … `/verify` on a document with numeric facts (dollar amounts, dates) and confirm citations use terse anchors (≤4 words) that are verbatim substrings of `f`
- … `/verify` on a tax/regulatory document and confirm percentages/thresholds are cited with the exact figure as `k`
- … `f`→`k` substring check
- … `n`, `r`, `f`, `k`, `p`, `l` order in generated output
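Several of the test-plan checks could be automated roughly as follows. This sketch assumes each citation is emitted as a JSON object with exactly the keys `n`, `r`, `f`, `k`, `p`, `l`, and that anchor length is counted in whitespace-separated words; both are assumptions, and `check_citation` and the sample values are hypothetical.

```python
import json
from typing import List

EXPECTED_ORDER = ["n", "r", "f", "k", "p", "l"]
MAX_ANCHOR_WORDS = 4  # the "≤4 words" limit from the test plan


def check_citation(raw: str) -> List[str]:
    """Return a list of problems found in one JSON-encoded citation (empty if none)."""
    citation = json.loads(raw)  # json.loads keeps keys in document order
    problems = []
    if list(citation.keys()) != EXPECTED_ORDER:
        problems.append("fields are not in n, r, f, k, p, l order")
    f = str(citation.get("f", ""))
    k = str(citation.get("k", ""))
    if len(k.split()) > MAX_ANCHOR_WORDS:
        problems.append("k is longer than 4 words")
    if k not in f:
        problems.append("k is not a verbatim substring of f")
    return problems


# Placeholder citation; the values of n, r, and l are illustrative only.
sample = json.dumps({
    "n": 1,
    "r": "placeholder",
    "f": "Filers owe a penalty of $450 per return.",
    "k": "$450",
    "p": "page_number_2_index_0",
    "l": "placeholder",
})
print(check_citation(sample))
```

Returning a list of problems rather than a single boolean lets one run report every failed check at once.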