
feat: add WASM-backed detector path and consolidate release packaging #50

Merged: dev-pi2pie merged 17 commits into main from dev on Mar 23, 2026

@dev-pi2pie (Owner)

Summary

This PR introduces an optional WASM-backed language detector flow for ambiguous Latin and Han text, exposes a new detector-specific package subpath, and consolidates CI/release packaging around a verified build artifact.

What Changed

  • Added a new detector API under @dev-pi2pie/word-counter/detector
  • Implemented async detector-aware counting and segmentation entrypoints
  • Added a Rust whatlang-based WASM detector crate and build pipeline
  • Routed ambiguous und-Latn / und-Hani chunks through the WASM detector when enabled
  • Added detector policies, remapping, fallback handling, and section-count support
  • Updated CLI/runtime options to support --detector regex|wasm
  • Expanded README and package metadata for the new detector entrypoint
  • Added package-content verification for published artifacts
  • Consolidated release workflows into a prepare + publish pipeline with shared artifacts
  • Added CI coverage for build, package verification, detector interop, and package type entrypoints
  • Fixed Windows portability issues in the WASM build helper and package typecheck test
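
The routing and remapping described above can be pictured with a small sketch. This is not the package's actual code: `Chunk`, `Detector`, `routeChunk`, and the remap table are hypothetical names chosen for illustration, and the only details taken from this PR are the `und-Latn` / `und-Hani` tags and the fact that detection is skipped unless a detector is enabled. The three-letter codes stand in for the ISO 639-3 style codes that a whatlang-based detector would likely report.

```typescript
// Hypothetical shapes; the real package's types are not shown in this PR.
type Script = "Latn" | "Hani";
type Chunk = { text: string; locale: string };
type Detector = (text: string, script: Script) => string | null;

// Script-only tags that the PR describes as ambiguous.
const AMBIGUOUS_TAGS = new Map<string, Script>([
  ["und-Latn", "Latn"],
  ["und-Hani", "Hani"],
]);

// Assumed remap from three-letter detector codes to short locale tags.
const REMAP = new Map<string, string>([
  ["eng", "en"],
  ["fra", "fr"],
  ["cmn", "zh"],
  ["jpn", "ja"],
]);

// Sends only ambiguous chunks to the detector; everything else passes through.
function routeChunk(chunk: Chunk, detector?: Detector): Chunk {
  const script = AMBIGUOUS_TAGS.get(chunk.locale);
  if (script === undefined || detector === undefined) return chunk;
  const code = detector(chunk.text, script);
  const remapped = code === null ? undefined : REMAP.get(code);
  // Unknown or missing detector results keep the original und-* tag.
  return remapped === undefined ? chunk : { ...chunk, locale: remapped };
}
```

In this sketch the detector is injected, so the regex-only path is simply the case where no detector is passed and every chunk is returned unchanged.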

Why

  • Improves language resolution for ambiguous script-only chunks without changing default behavior
  • Keeps the default regex path intact while allowing a higher-fidelity detector mode
  • Reduces release workflow duplication and verifies that published artifacts are actually complete
  • Improves cross-platform reliability for contributors and CI environments

API / Behavior Notes

  • Default detector mode remains regex
  • --detector wasm is opt-in
  • The detector subpath exposes async APIs such as wordCounterWithDetector
  • WASM runtime unavailability is surfaced with an explicit error message
  • Short or low-confidence ambiguous chunks still fall back to und-*
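
A rough sketch of how these behavior notes could fit together follows. It deliberately does not call `wordCounterWithDetector`, whose signature is not shown here; `resolveAmbiguous`, `WasmRuntime`, `loadOrExplain`, and both thresholds are hypothetical and only illustrate the opt-in mode, the explicit error on a missing runtime, and the fallback to the `und-*` tag.

```typescript
type DetectorMode = "regex" | "wasm";
type Guess = { lang: string; confidence: number };
// Hypothetical runtime interface standing in for the loaded WASM module.
type WasmRuntime = { detect(text: string): Promise<Guess | null> };

// Assumed thresholds; the PR does not state the real values.
const MIN_CHARS = 12;
const MIN_CONFIDENCE = 0.7;

// Wraps any load failure in one explicit, recognizable error message.
async function loadOrExplain(load?: () => Promise<WasmRuntime>): Promise<WasmRuntime> {
  try {
    if (!load) throw new Error("no loader configured");
    return await load();
  } catch (cause) {
    throw new Error(`WASM detector runtime is unavailable: ${String(cause)}`);
  }
}

async function resolveAmbiguous(
  text: string,
  undTag: string,
  mode: DetectorMode = "regex", // default mode stays regex
  load?: () => Promise<WasmRuntime>,
): Promise<string> {
  if (mode === "regex") return undTag;
  const runtime = await loadOrExplain(load);
  // Short chunks are not worth detecting; keep the und-* tag.
  if (text.trim().length < MIN_CHARS) return undTag;
  const guess = await runtime.detect(text);
  // Low-confidence or missing guesses also fall back to the und-* tag.
  if (guess === null || guess.confidence < MIN_CONFIDENCE) return undTag;
  return guess.lang;
}
```

Loading the runtime before the length check is a choice made for this sketch, so that a missing runtime is reported consistently regardless of input; the actual implementation may order these steps differently.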

Testing

  • bun test
  • bun run build:wasm
  • package type and interop coverage for root and detector entrypoints
  • package-content verification for release artifacts

Risks / Follow-ups

  • The WASM detector path depends on Rust + wasm-pack during build
  • Release/CI now relies on the prepared artifact flow, so future package layout changes should keep verify:package-contents updated
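
To illustrate the kind of check that verify:package-contents implies, here is a minimal sketch. The repository's real script is not shown in this PR, so `findMissingEntries` and the file names in the usage example are hypothetical; the only fact relied on is that npm tarballs place their files under a `package/` prefix.

```typescript
// Returns the required paths that are absent from a packed file listing.
function findMissingEntries(
  packedFiles: readonly string[],
  required: readonly string[],
): string[] {
  // Normalize tarball paths by dropping the leading "package/" prefix.
  const present = new Set(packedFiles.map((p) => p.replace(/^package\//, "")));
  return required.filter((entry) => !present.has(entry));
}

// Hypothetical usage: the listed paths are placeholders, not the real layout.
const missing = findMissingEntries(
  ["package/package.json", "package/dist/index.js"],
  ["package.json", "dist/index.js", "dist/detector.js"],
);
if (missing.length > 0) {
  console.error(`Missing from package: ${missing.join(", ")}`);
}
```

A check of this shape only helps if the required list is updated whenever the package layout changes, which is the follow-up risk noted above.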

dev-pi2pie added the documentation (Improvements or additions to documentation) and enhancement (New feature or request) labels on Mar 23, 2026
dev-pi2pie self-assigned this on Mar 23, 2026
dev-pi2pie merged commit 3ec1a51 into main on Mar 23, 2026
0 of 2 checks passed
dev-pi2pie deleted the dev branch on March 23, 2026 at 18:17