fix: replace JsonlDataParse with JSONDataParse to match Sedna API#564

Open
31groot wants to merge 1 commit into
kubeedge:mainfrom
31groot:fix/dataset-api-rename-563

@31groot 31groot commented Jun 23, 2026

What type of PR is this?
/kind bug

What this PR does / why we need it:
dataset.py was importing JsonlDataParse and JSONMetaDataParse from
sedna.datasources, but neither name exists in current Sedna:
JsonlDataParse was renamed to JSONDataParse, and JSONMetaDataParse
was never implemented. The resulting ImportError occurs on every fresh
install and blocks all examples from running.

Changes:

  • Replace JsonlDataParse with JSONDataParse
  • Remove duplicate JSONDataParse import
  • Replace JSONMetaDataParse with JSONDataParse as fallback for JSONFORLLM format
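
A minimal sketch of how one might check which parser names a given install exposes before importing them. The helper `missing_names` and the stand-in namespace `fake_datasources` are hypothetical and are not part of Ianvs or Sedna; the stand-in only mirrors what this PR description states (JSONDataParse exists, the other two names do not). Against a real install, the first argument would be the imported `sedna.datasources` module.

```python
import types


def missing_names(module, names):
    """Return the entries of `names` that `module` does not define."""
    return [name for name in names if not hasattr(module, name)]


# Hypothetical stand-in for sedna.datasources, built only from what the
# PR description says about current Sedna (an assumption, not a listing
# of the real module's contents).
fake_datasources = types.SimpleNamespace(JSONDataParse=object)

# Names that dataset.py tried to import before this fix, plus the
# replacement name.
required = ["JsonlDataParse", "JSONMetaDataParse", "JSONDataParse"]

print(missing_names(fake_datasources, required))
```

Any name reported as missing would raise ImportError in a `from ... import ...` statement, which is the failure this PR addresses.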

Which issue(s) this PR fixes:
Fixes #563

@kubeedge-bot kubeedge-bot added kind/bug Categorizes issue or PR as related to a bug. do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. labels Jun 23, 2026
@kubeedge-bot kubeedge-bot requested review from Poorunga and hsj576 June 23, 2026 20:10
@kubeedge-bot

Collaborator

Welcome @31groot! It looks like this is your first PR to kubeedge/ianvs 🎉

@kubeedge-bot

Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: 31groot
To complete the pull request process, please assign jaypume after the PR has been reviewed.
You can assign the PR to them by writing /assign @jaypume in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubeedge-bot kubeedge-bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jun 23, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request simplifies dataset parsing in core/testenvmanager/dataset/dataset.py by removing JsonlDataParse and JSONMetaDataParse and replacing them with JSONDataParse for both JSONL and JSONFORLLM formats. The reviewer suggested an improvement to chain the sequential if statements checking data_format using elif to avoid redundant evaluations.

Comment thread core/testenvmanager/dataset/dataset.py Outdated
Comment on lines 584 to 590
  if data_format == DatasetFormat.JSONL.value:
-     data = JsonlDataParse(data_type=data_type, func=feature_process)
+     data = JSONDataParse(data_type=data_type, func=feature_process)
      data.parse(file)

  if data_format == DatasetFormat.JSONFORLLM.value:
-     data = JSONMetaDataParse(data_type=data_type, func=feature_process)
+     data = JSONDataParse(data_type=data_type, func=feature_process)
      data.parse(file, **kwargs)

medium

Since data_format can only match one of the DatasetFormat values, these sequential if statements should be chained using elif. This avoids redundant condition evaluations once a match has been found.

Suggested change
- if data_format == DatasetFormat.JSONL.value:
-     data = JsonlDataParse(data_type=data_type, func=feature_process)
-     data = JSONDataParse(data_type=data_type, func=feature_process)
-     data.parse(file)
- if data_format == DatasetFormat.JSONFORLLM.value:
-     data = JSONMetaDataParse(data_type=data_type, func=feature_process)
-     data = JSONDataParse(data_type=data_type, func=feature_process)
-     data.parse(file, **kwargs)
+ elif data_format == DatasetFormat.JSONL.value:
+     data = JSONDataParse(data_type=data_type, func=feature_process)
+     data.parse(file)
+ elif data_format == DatasetFormat.JSONFORLLM.value:
+     data = JSONDataParse(data_type=data_type, func=feature_process)
+     data.parse(file, **kwargs)
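
The suggested chain begins with `elif`, so it only makes sense when attached to an earlier `if` branch in the same function. A self-contained sketch of that shape follows. Everything in it is a stand-in: `DatasetFormat` here is a reduced enum with assumed string values, `StubParser` replaces `JSONDataParse` so the example runs without Sedna, and `load` and the `split` keyword are hypothetical names used only for illustration.

```python
from enum import Enum


class DatasetFormat(Enum):
    # Stand-in for the Ianvs enum; member values are assumptions.
    CSV = "csv"
    JSONL = "jsonl"
    JSONFORLLM = "jsonforllm"


class StubParser:
    """Hypothetical replacement for JSONDataParse that records parse() calls."""

    def __init__(self, data_type, func=None):
        self.data_type = data_type
        self.func = func
        self.calls = []

    def parse(self, *args, **kwargs):
        self.calls.append((args, kwargs))


def load(data_format, file, data_type="train", feature_process=None, **kwargs):
    """Dispatch on data_format with a single if/elif chain."""
    if data_format == DatasetFormat.CSV.value:
        # Earlier branch that the elif chain attaches to; not modeled here.
        raise NotImplementedError("CSV handling is omitted from this sketch")
    elif data_format == DatasetFormat.JSONL.value:
        data = StubParser(data_type=data_type, func=feature_process)
        data.parse(file)
    elif data_format == DatasetFormat.JSONFORLLM.value:
        # Same parser class, but extra keyword arguments are forwarded.
        data = StubParser(data_type=data_type, func=feature_process)
        data.parse(file, **kwargs)
    else:
        raise ValueError(f"unsupported data format: {data_format!r}")
    return data


jsonl = load("jsonl", "train.jsonl")
llm = load("jsonforllm", "meta.json", split="test")
print(jsonl.calls)
print(llm.calls)
```

With a chain like this, at most one branch runs per call, and an unknown format fails loudly in the final `else` instead of falling through with `data` unset.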

@31groot 31groot force-pushed the fix/dataset-api-rename-563 branch from 261f95d to 465d4b9 Compare June 23, 2026 20:16
@kubeedge-bot kubeedge-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 23, 2026
@31groot 31groot force-pushed the fix/dataset-api-rename-563 branch 2 times, most recently from a065f7e to d5ef5cc Compare June 23, 2026 20:30
Signed-off-by: Parv Agrawal <agrawalparv13@gmail.com>
@31groot 31groot force-pushed the fix/dataset-api-rename-563 branch from d5ef5cc to 4b21485 Compare June 23, 2026 20:48
@kubeedge-bot kubeedge-bot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Jun 23, 2026
Labels

kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Bug] Core setup failures on fresh install: API rename, missing class, wrong docs path, undocumented deps

2 participants