fix: replace JsonlDataParse with JSONDataParse to match Sedna API#564
fix: replace JsonlDataParse with JSONDataParse to match Sedna API#56431groot wants to merge 1 commit into
Conversation
|
Welcome @31groot! It looks like this is your first PR to kubeedge/ianvs 🎉 |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: 31groot The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Code Review
This pull request simplifies dataset parsing in core/testenvmanager/dataset/dataset.py by removing JsonlDataParse and JSONMetaDataParse and replacing them with JSONDataParse for both JSONL and JSONFORLLM formats. The reviewer suggested an improvement to chain the sequential if statements checking data_format using elif to avoid redundant evaluations.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if data_format == DatasetFormat.JSONL.value: | ||
| data = JsonlDataParse(data_type=data_type, func=feature_process) | ||
| data = JSONDataParse(data_type=data_type, func=feature_process) | ||
| data.parse(file) | ||
|
|
||
| if data_format == DatasetFormat.JSONFORLLM.value: | ||
| data = JSONMetaDataParse(data_type=data_type, func=feature_process) | ||
| data = JSONDataParse(data_type=data_type, func=feature_process) | ||
| data.parse(file, **kwargs) |
There was a problem hiding this comment.
Since data_format can only match one of the DatasetFormat values, these sequential if statements should be chained using elif. This avoids redundant condition evaluations once a match has been found.
| if data_format == DatasetFormat.JSONL.value: | |
| data = JsonlDataParse(data_type=data_type, func=feature_process) | |
| data = JSONDataParse(data_type=data_type, func=feature_process) | |
| data.parse(file) | |
| if data_format == DatasetFormat.JSONFORLLM.value: | |
| data = JSONMetaDataParse(data_type=data_type, func=feature_process) | |
| data = JSONDataParse(data_type=data_type, func=feature_process) | |
| data.parse(file, **kwargs) | |
| elif data_format == DatasetFormat.JSONL.value: | |
| data = JSONDataParse(data_type=data_type, func=feature_process) | |
| data.parse(file) | |
| elif data_format == DatasetFormat.JSONFORLLM.value: | |
| data = JSONDataParse(data_type=data_type, func=feature_process) | |
| data.parse(file, **kwargs) |
261f95d to
465d4b9
Compare
a065f7e to
d5ef5cc
Compare
Signed-off-by: Parv Agrawal <agrawalparv13@gmail.com>
d5ef5cc to
4b21485
Compare
What type of PR is this?
/kind bug
What this PR does / why we need it:
dataset.pywas importingJsonlDataParseandJSONMetaDataParsefromsedna.datasources, but these names do not exist in current Sedna.Sedna renamed
JsonlDataParsetoJSONDataParseandJSONMetaDataParsewas never implemented. This caused an ImportError on every fresh install,
which blocks all examples from running.
Changes:
JsonlDataParsewithJSONDataParseJSONDataParseimportJSONMetaDataParsewithJSONDataParseas fallback for JSONFORLLM formatWhich issue(s) this PR fixes:
Fixes #563