Add WASM detector observability and fix Latin hint routing #53

Merged
dev-pi2pie merged 23 commits into main from dev
Mar 24, 2026
@dev-pi2pie (Owner)

Summary

This PR adds structured WASM detector observability across single-input and batch CLI flows, improves JSON/debug reporting, and fixes Latin hint handling/regressions in WASM mode.

Fixes #52

What changed

Detector observability

  • added a structured detector debug model with summary counters and optional per-window evidence payloads
  • added --detector-evidence for WASM runs, with compact and verbose debug output modes
  • added a stable debug event envelope with:
    • schemaVersion
    • timestamp
    • runId
    • topic
    • scope
    • verbosity
  • added detector evidence/debug report naming for autogenerated report files
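The envelope fields listed above could be modeled roughly as follows. This is a minimal sketch, not the shape the PR actually ships: only the six field names come from the description, while the field types, the `payload` member, the `makeEnvelope` helper, and the schema version value `1` are assumptions made for illustration.

```typescript
// Hypothetical model of the debug event envelope. Field names are taken from
// the PR description; all types and values here are assumptions.
type Verbosity = "compact" | "verbose";

interface DebugEventEnvelope<TPayload = unknown> {
  schemaVersion: number; // assumed numeric; the real contract may differ
  timestamp: string;     // assumed ISO 8601 string
  runId: string;
  topic: string;
  scope: string;
  verbosity: Verbosity;
  payload?: TPayload;    // assumption: not named in the PR description
}

// Hypothetical helper that wraps a payload in the envelope.
function makeEnvelope<T>(
  runId: string,
  topic: string,
  scope: string,
  verbosity: Verbosity,
  payload?: T,
  now: Date = new Date(),
): DebugEventEnvelope<T> {
  return {
    schemaVersion: 1,
    timestamp: now.toISOString(),
    runId,
    topic,
    scope,
    verbosity,
    // Only attach the payload when one was supplied.
    ...(payload === undefined ? {} : { payload }),
  };
}
```

Keeping the envelope separate from the payload means consumers can filter on `topic` or `scope` without knowing the detector-specific payload shape.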

CLI and batch integration

  • threaded detector debug context through:
    • single-input runtime
    • async batch executor
    • worker-pool batch executor
  • forwarded worker detector debug events back to the parent process
  • added detector summaries to JSON output when --debug is enabled
  • kept detector events file-scoped in batch mode and gated detector evidence behind --debug and --detector wasm
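The gating rule in the last bullet can be sketched as a single predicate. The `CliFlags` shape and the `shouldEmitDetectorEvidence` name are hypothetical; only the three flags (`--debug`, `--detector wasm`, `--detector-evidence`) come from this PR.

```typescript
// Hypothetical parsed-flag shape; the real CLI option object may differ.
interface CliFlags {
  debug: boolean;            // --debug
  detector: string;          // --detector <mode>; only "wasm" is named in the PR
  detectorEvidence: boolean; // --detector-evidence
}

// Evidence is opt-in and only meaningful for debug runs using the WASM detector.
function shouldEmitDetectorEvidence(flags: CliFlags): boolean {
  return flags.detectorEvidence && flags.debug && flags.detector === "wasm";
}
```

Under this sketch, passing `--detector-evidence` without `--debug`, or with a non-WASM detector, produces no evidence payloads.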

WASM detector quality and hint ordering

  • deferred Latin hint application until after WASM detector routing so detector-derived locales are not overridden by a Latin tag hint
  • preserved built-in and custom Latin hint subspans inside accepted WASM windows
  • fixed hard-wrapped prose false negatives in the Latin quality gate by evaluating contiguous prose blocks instead of isolated physical lines
  • preserved fallback relabeling behavior for unresolved Latin detector windows
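To illustrate the hard-wrap fix, the sketch below joins consecutive non-blank lines into prose blocks before applying a quality check. Both functions are hypothetical: `passesToyGate` is a stand-in word-count threshold, not the project's actual Latin quality gate, and the blank-line block boundary is an assumption.

```typescript
// Join consecutive non-blank lines into one block; blank lines separate blocks.
function groupProseBlocks(text: string): string[] {
  const blocks: string[] = [];
  let current: string[] = [];
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") {
      if (current.length > 0) {
        blocks.push(current.join(" "));
        current = [];
      }
    } else {
      current.push(trimmed);
    }
  }
  if (current.length > 0) blocks.push(current.join(" "));
  return blocks;
}

// Toy stand-in for a quality gate: require a minimum number of ASCII-letter words.
function passesToyGate(sample: string, minWords: number = 6): boolean {
  const words = sample.match(/[A-Za-z]+/g) ?? [];
  return words.length >= minWords;
}

const wrapped = "The quick brown\nfox jumps over\nthe lazy dog.\n\nShort.";
// Each physical line is too short to pass on its own...
const perLine = wrapped.split("\n").map((l) => passesToyGate(l));
// ...but the first contiguous block passes once its lines are joined.
const perBlock = groupProseBlocks(wrapped).map((b) => passesToyGate(b));
```

With a per-line check, every line of the wrapped paragraph is rejected, which is the kind of false negative the bullet describes; evaluating the joined block avoids it.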

Docs and contracts

  • updated README, JSON output contract, and debug event stream contract
  • added/updated plan, research, and job records covering:
    • detector evidence design
    • global debug observability
    • WASM Latin hint ordering
    • follow-up regression fixes
  • bumped embedded/package version to 0.1.5-canary.3

Why

This branch improves visibility into WASM detector decisions without changing default output behavior for normal users, and it resolves correctness issues around Latin hint precedence and wrapped-prose detection.

Verification

  • bun test test/word-counter.test.ts
  • bun test test/command.test.ts

Notes

  • debug JSON remains opt-in via --debug
  • detector evidence remains opt-in via --detector-evidence
  • batch worker and non-worker execution paths now have matching detector debug behavior

… add global debug observability model documentation
…w filename formats and regression coverage details
…hema versioning details and approved fixture matrix for token-quality gate research
@dev-pi2pie self-assigned this Mar 24, 2026
@dev-pi2pie added the documentation and enhancement labels Mar 24, 2026
@dev-pi2pie merged commit dd4fb64 into main Mar 24, 2026
1 check passed
@dev-pi2pie deleted the dev branch March 24, 2026 13:28
Development

Successfully merging this pull request may close these issues.

WASM detector skips ambiguous Latin remap when --latin-tag is set and changes collector totals