feature: New bazel test for config and dictionaries #9
Merged
- Switch all tests (C++ CLI, Python, Node) to consume `testcases.json` and drop `.in`/`.ans` dependencies; keep filegroup for the JSON.
- Prune TWPhrases sub-dictionary artifacts and align DictionaryTest to current generated dict set.
- Add rapidjson dep/path for CLI test, refresh_assets script fixes, and keep Bazel Python toolchain note.
Pull request overview
This PR modernizes the test infrastructure by consolidating all test cases into a single testcases.json file, replacing the legacy .in/.ans file pairs. The change improves maintainability by establishing a single source of truth for test data across CLI, Python, and Node.js test suites.
Key Changes:
- Unified test data format using `testcases.json` with structured test cases containing input, expected outputs per configuration, and unique IDs
- Updated all test runners (C++, Python, Node.js) to consume JSON-based test cases instead of file-based inputs
- Streamlined dictionary build to exclude intermediate phrase components from standalone binary outputs
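A minimal sketch, in Python, of how a runner might iterate such a file. The schema used here (a top-level `cases` array whose entries carry `id`, `input`, and an `expected` map keyed by config name) is an assumption inferred from the description above, not copied from the repository. `fake_convert` is a hypothetical stand-in for the real converter.

```python
import json

# Hypothetical sample; the real testcases.json layout may differ.
SAMPLE = """
{
  "cases": [
    {"id": "case_001", "input": "abc",
     "expected": {"upper": "ABC", "identity": "abc"}},
    {"id": "case_002", "input": "xyz",
     "expected": {"upper": "XYZ"}}
  ]
}
"""


def iter_expectations(doc):
    """Yield (case_id, config, input_text, expected_text) per case/config pair."""
    seen = set()
    for case in doc["cases"]:
        case_id = case["id"]
        # IDs are described as unique, so reject duplicates early.
        if case_id in seen:
            raise ValueError(f"duplicate test case id: {case_id}")
        seen.add(case_id)
        for config, expected in case["expected"].items():
            yield case_id, config, case["input"], expected


def run_cases(doc, convert):
    """Return the (case_id, config) pairs whose output did not match."""
    failures = []
    for case_id, config, text, expected in iter_expectations(doc):
        if convert(config, text) != expected:
            failures.append((case_id, config))
    return failures


def fake_convert(config, text):
    # Stand-in converter for illustration only.
    return text.upper() if config == "upper" else text


failures = run_cases(json.loads(SAMPLE), fake_convert)
print(failures)
```

Keeping the expected outputs keyed by config inside each case means a runner only needs one loop to cover every config that a case declares, which is what lets the CLI, Python, and Node suites share one data file.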
Reviewed changes
Copilot reviewed 37 out of 37 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/testcases/testcases.json | New consolidated JSON file containing all test cases with inputs and expected outputs per config |
| test/testcases/.in, test/testcases/.ans | Removed legacy test input/answer files (migrated to JSON) |
| test/testcases/BUILD.bazel | Updated to reference testcases.json instead of globbing .in/.ans files |
| test/CommandLineConvertTest.cpp | Refactored to parse testcases.json and dynamically generate test cases |
| test/BUILD.bazel | Added rapidjson dependency for JSON parsing |
| test/CMakeLists.txt | Added rapidjson include path for test compilation |
| python/tests/test_opencc.py | Migrated from glob-based file reading to JSON-based test iteration |
| node/test.js | Refactored to read testcases.json and iterate over cases instead of hardcoded config list |
| data/dictionary/DictionaryTest.cpp | Updated dictionary list to exclude removed TWPhrases component files |
| data/dictionary/BUILD.bazel | Added PHRASE_PARTS exclusion to prevent standalone .ocd2 generation for merge components |
| data/config/ConfigDictValidationTest.cpp | New end-to-end validation test for configs against testcases.json |
| data/config/BUILD.bazel | Added cc_test target for config validation |
- Rename and guard streams in CommandLineConvertTest; ensure input file opens and normalize CRLF.
- Fix node test promise handling to propagate errors correctly.
- Mark ConfigDictValidationTest as Bazel-only to skip CMake builds.
frankslin added a commit that referenced this pull request on Jan 3, 2026
## Summary
- add a `//data/config:config_dict_validation_test` to test dictionaries and configs against a `testcases.json` file
- switch all CLI/Python/Node tests to consume `testcases.json` as the single source of truth; drop `.in/.ans` dependencies and adjust Bazel/CMake wiring
- streamline dictionary build outputs (no standalone `TWPhrases{IT,Name,Other}.ocd2`) and align DictionaryTest with the actual generated dict set
- add maintenance helpers (refresh_assets.sh cleanup and fix, rapidjson dep/path for CLI test) and keep wasm assets in sync via `testcases.json`
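
The exclusion of standalone phrase-part outputs can be pictured as a simple list filter. The sketch below is Python rather than the actual Bazel rules, and only the three `TWPhrases{IT,Name,Other}` names come from this PR; the remaining entries in `DICTIONARY_SOURCES` and the helper `standalone_outputs` are illustrative assumptions.

```python
# Components the PR describes as merge inputs with no standalone .ocd2 output.
PHRASE_PARTS = ["TWPhrasesIT", "TWPhrasesName", "TWPhrasesOther"]

# Illustrative source list; the real list in BUILD.bazel is not shown here.
DICTIONARY_SOURCES = ["STCharacters", "STPhrases", "TWPhrases"] + PHRASE_PARTS


def standalone_outputs(sources, excluded):
    """Return the .ocd2 file names to build, skipping excluded components."""
    return [f"{name}.ocd2" for name in sources if name not in excluded]


outputs = standalone_outputs(DICTIONARY_SOURCES, PHRASE_PARTS)
print(outputs)
```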
## Testing
- bazel test //data/dictionary:dictionary_test
- bazel test //test:command_line_converter_test
- bazel test //python/tests:test_opencc
- node/test.js (sync/async/promise) using updated testcases.json
----
* feature: add a new ConfigDictValidationTest.cpp to be executed in bazel
* Changeover to JSON-based testcases and clean dictionary outputs
- Switch all tests (C++ CLI, Python, Node) to consume `testcases.json` and drop `.in`/`.ans` dependencies; keep filegroup for the JSON.
- Prune TWPhrases sub-dictionary artifacts and align DictionaryTest to current generated dict set.
- Add rapidjson dep/path for CLI test, refresh_assets script fixes, and keep Bazel Python toolchain note.
* Normalize CommandLineConvertTest for CRLF comparisons on Windows
* Address review feedback for tests and Bazel-only validation
- Rename and guard streams in CommandLineConvertTest; ensure input file opens and normalize CRLF.
- Fix node test promise handling to propagate errors correctly.
- Mark ConfigDictValidationTest as Bazel-only to skip CMake builds.
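
The CRLF normalization mentioned in the commits above lives in the C++ test; the idea can be sketched in Python as follows. The helper names `normalize_newlines` and `outputs_match` are hypothetical, and the real test may normalize differently (for example, by stripping `\r` per line).

```python
def normalize_newlines(text: str) -> str:
    """Convert Windows CRLF line endings to LF; lone CR characters are kept."""
    return text.replace("\r\n", "\n")


def outputs_match(actual: str, expected: str) -> bool:
    """Compare converter output with the expected text, ignoring CRLF vs LF."""
    return normalize_newlines(actual) == normalize_newlines(expected)


# Output produced on Windows compared against an LF-only expectation.
print(outputs_match("line one\r\nline two\r\n", "line one\nline two\n"))
```

Normalizing both sides, rather than only the actual output, keeps the comparison symmetric if the expected text itself was checked out with CRLF endings.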