fix bug#24

Merged: hejujie merged 2 commits into main from fix_bug_code_data_filter on Apr 25, 2025
Conversation

@yanrui27
Collaborator

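import json

def process_ground_truth(item):
    # Parse the JSON-encoded ground_truth string into a Python object, if present.
    if "reward_model" in item and "ground_truth" in item["reward_model"]:
        try:
            item["reward_model"]["ground_truth"] = json.loads(item["reward_model"]["ground_truth"])
        except (json.JSONDecodeError, TypeError):
            # Leave the value unchanged if it is not a valid JSON string.
            pass
    return item

dataset = dataset.map(process_ground_truth)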

During data preprocessing, calling map with process_ground_truth unintentionally added extra fields to the ground_truth field of the code data. These extra fields caused the wrong code sandbox to be selected, which led to unexpected bugs.
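The PR does not spell out how the extra fields appear or how the sandbox is chosen, so the following is only a minimal sketch under stated assumptions. It assumes the extra fields come from schema unification (every parsed dict is padded with keys seen in other rows, set to None) and that sandbox selection checks for key presence. The key names inputs, outputs, and fn_name, the sandbox labels, and the helpers unify_schema, pick_sandbox, and strip_padding are all hypothetical and are not taken from the repository.

```python
import json


def unify_schema(records):
    """Simulate the assumed padding: every dict gets every key seen in any row."""
    all_keys = []
    for rec in records:
        for key in rec:
            if key not in all_keys:
                all_keys.append(key)
    # Keys missing from a row are filled with None.
    return [{key: rec.get(key) for key in all_keys} for rec in records]


def pick_sandbox(ground_truth):
    """Hypothetical selection rule that only checks whether a key is present."""
    return "function_call" if "fn_name" in ground_truth else "stdin_stdout"


def strip_padding(ground_truth):
    """Drop None-valued keys to recover the fields that were originally stored."""
    return {k: v for k, v in ground_truth.items() if v is not None}


# Two hypothetical ground_truth strings with different sets of fields.
raw = [
    '{"inputs": ["1 2"], "outputs": ["3"]}',
    '{"fn_name": "add", "inputs": [[1, 2]], "outputs": [3]}',
]
parsed = [json.loads(s) for s in raw]

# After padding, the first row carries a spurious fn_name key set to None,
# so the presence-based rule picks a different sandbox than intended.
padded = unify_schema(parsed)
cleaned = [strip_padding(gt) for gt in padded]

sandbox_before = pick_sandbox(parsed[0])
sandbox_padded = pick_sandbox(padded[0])
sandbox_cleaned = pick_sandbox(cleaned[0])
```

Under these assumptions, a presence check such as "fn_name" in ground_truth is fragile once rows are padded. Removing the padded keys, or keeping ground_truth as a JSON string until it is actually needed, would avoid the mismatch.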

Before the fix:
[image]

After the fix:
[image]

Comment thread or1_scripts/data_preprocess/download_and_filter_data_1p5b.py Outdated
@hejujie hejujie merged commit dd69074 into main Apr 25, 2025
0 of 10 checks passed
2 participants