Conversation
- Document how to use the WASM build output in front-end JS
frankslin
added a commit
that referenced
this pull request
Jan 1, 2026
* Add WASM demo scaffold and project notes
* Add OpenCC WASM demo with converter UI and test runner
  - Document how to use the WASM build output in front-end JS
* Polish WASM demo UI and paths, run tests, and streamline converter export
* Add wasm-based OpenCC package and update demo to consume it
* Add wasm-based OpenCC package, static demo bundle, and benchmarking page
* Add copyright notice and LICENSE
frankslin
added a commit
that referenced
this pull request
Jan 3, 2026
frankslin
pushed a commit
that referenced
this pull request
Jan 9, 2026
This commit addresses two severe security vulnerabilities discovered in OpenCC's UTF-8 text processing logic.

## Vulnerability 1: MaxMatchSegmentation Buffer Overflow (Issue BYVoid#997)

**Location:** src/MaxMatchSegmentation.cpp
**Type:** Heap buffer overflow via integer underflow
**CVSS:** ~7.5 (High)

Problem:
- Used manual length decrement: `length -= matchedLength`
- When the UTF-8 character length exceeded the remaining bytes, this caused a `size_t` underflow
- The next `MatchPrefix()` call received a huge length value and read beyond the buffer

Example trigger:
- Input: "一" + `\xE4\xB8` (truncated 3-byte sequence)
- Iteration 2: remainingLength=2, NextCharLength=3
- Old: length = 2 - 3 = SIZE_MAX (underflow)
- Result: Buffer overflow read

## Vulnerability 2: Conversion Information Disclosure (More Severe)

**Location:** src/Conversion.cpp
**Type:** Information disclosure + heap buffer overflow
**CVSS:** ~8.6 (High)

Problem:
- Similar to Vulnerability 1, but worse: leaked data is written to the result
- When processing truncated UTF-8, the code would jump over the null terminator
- It would then continue reading and output heap memory contents
- Could leak: encryption keys, passwords, user data, etc.

Example exploit:
- Input: "干" + `\xE5\xB9` + null
- Output: "幹" + heap_garbage_data
- Attacker receives sensitive information directly

Why more severe than Vulnerability 1:
- Issue BYVoid#997: Buffer overflow (no data output)
- This bug: Buffer overflow + data exfiltration
- Direct information disclosure to attacker

## Solution

Implemented a defense-in-depth approach with multiple layers:

1. **Layer 1 - Dynamic length calculation:**
   ```cpp
   const char* textEnd = text.c_str() + text.length();
   size_t remainingLength = textEnd - pstr; // Always accurate
   ```
2. **Layer 2 - Explicit boundary checks:**
   ```cpp
   if (matchedLength > remainingLength) {
     matchedLength = remainingLength; // Clamp to safe value
   }
   ```
3. **Layer 3 - Loop termination:**
   - Existing `*pstr != '\0'` check as final safeguard
4. **Layer 4 - Dictionary match validation:**
   - Also validate that KeyLength() doesn't exceed remainingLength
   - Defense even against corrupted dictionary data

## Changes

**Code fixes:**
- src/MaxMatchSegmentation.cpp:
  * Calculate textEnd pointer once
  * Dynamically compute remainingLength per iteration
  * Add explicit bounds check for NextCharLength result
  * Pass remainingLength to MatchPrefix
- src/Conversion.cpp:
  * Calculate phraseEnd pointer once
  * Dynamically compute remainingLength per iteration
  * Add bounds checks for both NextCharLength and KeyLength
  * Prevent reading beyond null terminator

**Test coverage:**
- src/MaxMatchSegmentationTest.cpp:
  * Add TruncatedUtf8Sequence test
  * Verifies handling of incomplete UTF-8 sequences
  * Ensures output preserves all input bytes (no data loss)
- src/ConversionTest.cpp:
  * Add TruncatedUtf8Sequence test
  * Verifies conversion works + no information leak
  * Tests with "干" → "幹" + preserved incomplete sequence

## Behavior Verification

**Normal input:** Behavior completely unchanged
- Old: length values 9→6→3→0
- New: remainingLength values 9→6→3→0
- Boundary checks never trigger
- Zero performance impact

**Malicious input:** Now safely handled
- Incomplete UTF-8 sequences preserved (no data loss)
- No buffer overruns
- No information disclosure
- All tests pass (15/15)

## Security Impact

- Fixes CWE-125 (Out-of-bounds Read)
- Fixes CWE-200 (Information Exposure)
- Prevents DoS attacks
- Prevents information disclosure attacks
- Backward compatible with all normal use cases

Discovered during security audit. All users should upgrade immediately if processing untrusted input.

Fixes BYVoid#997
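The Layer 1 and Layer 2 checks from the commit message above can be illustrated with a small standalone sketch. This is not the OpenCC code: `NextCharLengthSketch`, `SplitUtf8Clamped`, and `OldStyleDecrement` are hypothetical names invented for illustration, and the loop stops on a pointer comparison against `textEnd` rather than on the `*pstr != '\0'` check the commit mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a UTF-8 helper: returns the byte length implied
// by the lead byte, without checking that those bytes actually exist.
inline std::size_t NextCharLengthSketch(const char* p) {
  const unsigned char c = static_cast<unsigned char>(*p);
  if ((c & 0x80) == 0x00) return 1;  // 0xxxxxxx: ASCII
  if ((c & 0xE0) == 0xC0) return 2;  // 110xxxxx
  if ((c & 0xF0) == 0xE0) return 3;  // 1110xxxx
  if ((c & 0xF8) == 0xF0) return 4;  // 11110xxx
  return 1;                          // stray continuation or invalid byte
}

// Splits text into per-character chunks, never reading past text.size().
inline std::vector<std::string> SplitUtf8Clamped(const std::string& text) {
  std::vector<std::string> chunks;
  const char* pstr = text.data();
  const char* const textEnd = pstr + text.size();
  while (pstr < textEnd) {
    // Layer 1: recompute the remaining length from pointers on every pass
    // instead of decrementing a running size_t counter.
    const std::size_t remainingLength =
        static_cast<std::size_t>(textEnd - pstr);
    std::size_t matchedLength = NextCharLengthSketch(pstr);
    // Layer 2: clamp a truncated sequence to the bytes that are present.
    if (matchedLength > remainingLength) {
      matchedLength = remainingLength;
    }
    assert(matchedLength >= 1 && matchedLength <= remainingLength);
    chunks.emplace_back(pstr, matchedLength);
    pstr += matchedLength;
  }
  return chunks;
}

// The old bookkeeping, reduced to its arithmetic: unsigned subtraction
// wraps around instead of going negative.
inline std::size_t OldStyleDecrement(std::size_t length,
                                     std::size_t matchedLength) {
  return length - matchedLength;
}
```

With the trigger input from the commit message, `SplitUtf8Clamped(std::string("\xE4\xB8\x80") + "\xE4\xB8")` yields two chunks, the complete three-byte character and the two-byte truncated tail, so every input byte is preserved. `OldStyleDecrement(2, 3)`, by contrast, wraps around to `SIZE_MAX`, which is the value the old code would have passed on as the remaining length.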
frankslin
pushed a commit
that referenced
this pull request
Jan 9, 2026
frankslin
added a commit
that referenced
this pull request
Jan 13, 2026
frankslin
added a commit
that referenced
this pull request
Jan 14, 2026
frankslin
added a commit
that referenced
this pull request
Jan 14, 2026
frankslin
added a commit
that referenced
this pull request
Jan 14, 2026
frankslin
added a commit
that referenced
this pull request
Jan 14, 2026
frankslin
added a commit
that referenced
this pull request
Jan 14, 2026
frankslin
added a commit
that referenced
this pull request
Jan 16, 2026
frankslin
added a commit
that referenced
this pull request
Jan 21, 2026
frankslin
added a commit
that referenced
this pull request
Jan 24, 2026
frankslin
added a commit
that referenced
this pull request
Jan 28, 2026
frankslin
added a commit
that referenced
this pull request
Mar 9, 2026
frankslin
added a commit
that referenced
this pull request
Mar 17, 2026
frankslin
added a commit
that referenced
this pull request
Mar 18, 2026
Source code for https://www.npmjs.com/package/opencc-wasm