fix(walker+ui): verb stops + CN trailing strips + highlight bugs (batch r2) #181
Merged
… bugs

Batch round 2 (2026-06-01). Five autoship-able issues + two UI bugs that surfaced from the same user screenshot (fix proper).

US walker R9, _STOP_WORDS regex extension:
- Added `move|complies` to the finite-verb stop list. Mirror of the TW R10 anti-N+V over-capture mechanism. Cross-jurisdiction parity: no CN report of analogous over-capture for these specific verbs yet; deferred per DR-1 until empirical evidence surfaces. Both verbs are unambiguous finite verbs in patent claim diction; same conceptual class as R7's exceeds|stays + R8's occur|approach. 2 corpus labels silenced (both walker_fp, no protect_violations), dual-labeled with resolved_by marker. Closes #178, #180.

CN spec-support, trailing-token extension:
- _CN_SPEC_SUPPORT_TRAILING_TOKENS += `抵靠|穿设|穿過|分别穿过`. Locative + perforation verbs that the intro extractor was capturing as part of the noun phrase. Drafter-grade verb diction in CNIPA-registered specs. Closes #174, #175, #176.

Frontend AntecedentBasisCard, two highlight bugs:
- CN: no terms were highlighting at all because CJK_REF_PREFIXES only contained Traditional Chinese variants (該等 / 該些 / 該). Simplified Chinese (该等 / 该些 / 该) fell through to the English-wrap regex branch, which never matches CJK text. Added Simplified variants.
- US: longer findings were only highlighting their prefix because JavaScript regex alternation is leftmost-match, not longest. For two findings on the same claim where one term is a prefix of the other, the shorter alternative wins. Sort all three alternative arrays (cjkParts / refFormParts / bareParts) by length DESCENDING before building the combined pattern. Same bug class applied symmetrically to CJK alternation.

Gates:
- pytest: 2704 passed / 11 skipped
- US harness: 0 unresolved_new / 0 unresolved_removed / 0 protect_violations
- CN harness: clean (no silences in this batch)
- Wheel rebuilt
kwisschen added a commit that referenced this pull request Jun 1, 2026
Two complementary cleanups discovered when auditing whether walker-round discipline was being applied properly:

1. round_history backfill (PR #182 rule additions)
   - CN R38 `r38_locative_adjective_trailing_strip`: adds 相邻 to _TRAILING_VERB_DENYLIST_CN. Real user signal: issue #171. Cross-juris: TW already covers via existing trailing-strip pipeline.
   - US R10 `r10_through_hole_compound_np_synthesis`: adds bare-cardinal + through hole(s) compound-NP post-extension. Real user signal: issue #178 finding 1. Cross-juris: deferred per DR-1 (CJK 通孔 is a single token).
   Both marked fixtures_silenced: 0 honestly (rule additions only; corpus didn't exercise these patterns).

2. TW spec_support parity (mirror of PR #181 CN additions)
   - _TW_SPEC_SUPPORT_TRAILING_TOKENS += 抵靠 / 穿設 / 穿過 / 分別穿過
   - PR #181 added the SC variants to CN from real CN reports (#174 / #175 / #176). All four are Traditional-compatible perforation / abutment verbs. Generalized per the user's standing instruction to mirror fixes across applicable jurisdictions.

Discipline lesson: autoship walker fixes that don't silence corpus labels still need round_history entries, because the cross-jurisdiction discipline pytest (tests/test_cross_jurisdiction_discipline.py) audits round_history for parity decisions. Updated triage-report SKILL.md (local) to enforce this going forward.

Gates:
- pytest: 2704 passed / 11 skipped
- cross-jurisdiction discipline pytest: passes
- US harness: 0 / 0 / 0
- CN harness: 0 / 0 / 0
Batch round 2 — five autoship issues + two UI bugs
Walker fixes
UI fixes (same file: `AntecedentBasisCard.jsx`)
CN highlighting was completely absent. Root cause: `CJK_REF_PREFIXES` only contained Traditional Chinese variants (該等 / 該些 / 該). Simplified Chinese terms (该等 / 该些 / 该) fell through to the English-wrap regex branch (`(?:the|said)\s+...`), which never matches CJK text. Fix: add the Simplified variants.
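The prefix mismatch can be reproduced in isolation. The snippet below is a minimal sketch, not the code in `AntecedentBasisCard.jsx`: the helper `hasCjkRefPrefix`, the array names, the sample term, and the simplified English-wrap pattern are all assumptions made for illustration (the PR text elides the real pattern).

```javascript
// Prefix lists before and after the fix (array names are hypothetical).
const TRADITIONAL_PREFIXES = ["該等", "該些", "該"];
const SIMPLIFIED_PREFIXES = ["该等", "该些", "该"];
const FIXED_PREFIXES = [...TRADITIONAL_PREFIXES, ...SIMPLIFIED_PREFIXES];

// Hypothetical helper: does the finding term start with a CJK reference prefix?
function hasCjkRefPrefix(term, prefixes) {
  return prefixes.some((prefix) => term.startsWith(prefix));
}

// Simplified stand-in for the English-wrap branch; the real pattern is elided.
const englishWrap = /(?:the|said)\s+\S+/i;

// Made-up Simplified Chinese finding term, for illustration only.
const term = "该夹持件";

const beforeFix = hasCjkRefPrefix(term, TRADITIONAL_PREFIXES);
const afterFix = hasCjkRefPrefix(term, FIXED_PREFIXES);
const englishWrapMatches = englishWrap.test(term);

console.log({ beforeFix, afterFix, englishWrapMatches });
```

With the Traditional-only list the term is not recognised as CJK (`beforeFix` is `false`), and the English-wrap pattern cannot match it either, so nothing is highlighted. Only the extended list classifies it correctly.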
US longer findings only highlighted their prefix. Root cause: JavaScript regex alternation is ordered, not longest-match: at any given position the first alternative that matches wins. For a draft with two findings, `the two` and `the two clamping members respectively move`, the shorter alternative wins whenever it is listed first, even though the longer one would consume more text. Fix: sort all three alternative arrays (`cjkParts` / `refFormParts` / `bareParts`) by length, descending, before building the combined pattern.
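The alternation behaviour is easy to see with the two findings quoted above. This is a minimal sketch under stated assumptions: `escapeRegExp` and `buildPattern` are hypothetical helpers, the claim sentence is made up, and the real component builds its pattern from three arrays rather than one.

```javascript
// Escape regex metacharacters so each finding is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: join finding terms into one global alternation.
// With sortByLength, longer terms are listed first so they are tried first.
function buildPattern(terms, sortByLength) {
  const parts = terms.map(escapeRegExp);
  if (sortByLength) {
    parts.sort((a, b) => b.length - a.length);
  }
  return new RegExp(`(?:${parts.join("|")})`, "g");
}

// Made-up claim text containing both findings from the PR description.
const claim =
  "wherein the two clamping members respectively move toward each other";
const findings = ["the two", "the two clamping members respectively move"];

// Shorter term listed first: it wins at the shared start position.
const unsorted = claim.match(buildPattern(findings, false));

// Longer term listed first: the full finding is matched.
const sorted = claim.match(buildPattern(findings, true));

console.log(unsorted); // [ 'the two' ]
console.log(sorted); // [ 'the two clamping members respectively move' ]
```

Sorting by length is a simple way to emulate longest-match on top of ordered alternation. It works here because any term that is a prefix of another is necessarily shorter and therefore ends up later in the pattern.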
Gates
Discipline
Closes #174
Closes #175
Closes #176
Closes #178
Closes #180