Reusing `extract_answer_loose` in `accuracy_reward` by jamesbraza · Pull Request #7 · Future-House/ether0

jamesbraza · 2025-06-06T21:50:41Z

#6 forgot to use the `test` argument, and when adding it just now, I realized:

- `accuracy_reward` should reuse `extract_answer_loose` when `test` is set
  - The duplication came from the organic growth of our internal code base as many separate scripts, which arose without ever being DRY'd out
- The baselines demo could use `accuracy_reward` directly

Copilot

Pull Request Overview

This PR refactors how the test flag is handled in reward extraction by reusing extract_answer_loose in the accuracy_reward function and updates the README to demonstrate its usage.

- Replaces duplicated logic with a concise ternary expression in rewards.py.
- Updates the README to leverage the new accuracy_reward function for baseline demos.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/ether0/rewards.py | Refactors answer extraction to use a ternary expression based on test. |
| README.md | Removes old extraction logic and shows usage of accuracy_reward. |

Comments suppressed due to low confidence (2)

src/ether0/rewards.py:706

[nitpick] Consider adding parentheses around the ternary expression to improve readability and clarify the execution order.

extract_answer_loose(content) if test else extract_thought_answer_strict(content, reasoning=reasoning)[1]
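For readers without the diff at hand, the following is a minimal sketch of the selection pattern in the line above, with the parentheses the nitpick suggests. Only the two function names, the `test` and `reasoning` parameters, and the `[1]` index come from this PR; the function bodies, the `<think>`/`<answer>` tag format, and the `select_answer` wrapper are hypothetical stand-ins, not the ether0 implementation.

```python
import re
from typing import Optional, Tuple


def extract_answer_loose(content: str) -> str:
    """Hypothetical stand-in: take the first <answer>...</answer> found anywhere."""
    match = re.search(r"<answer>(.*?)</answer>", content, re.DOTALL)
    return match.group(1).strip() if match else ""


def extract_thought_answer_strict(
    content: str, reasoning: bool
) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical stand-in: the whole string must match the expected format."""
    if reasoning:
        pattern = r"\s*<think>(.*?)</think>\s*<answer>(.*?)</answer>\s*"
        match = re.fullmatch(pattern, content, re.DOTALL)
        return (match.group(1), match.group(2)) if match else (None, None)
    match = re.fullmatch(r"\s*<answer>(.*?)</answer>\s*", content, re.DOTALL)
    return (None, match.group(1)) if match else (None, None)


def select_answer(content: str, reasoning: bool, test: bool) -> Optional[str]:
    """Hypothetical wrapper showing the parenthesized ternary from the review."""
    return (
        extract_answer_loose(content)
        if test
        else extract_thought_answer_strict(content, reasoning=reasoning)[1]
    )
```

Under these assumed formats, a response with stray text around the answer tags still yields an answer when `test=True`, while the strict path returns `None` for the same input.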

README.md:191

[nitpick] Consider adding a brief comment explaining the use of strict=True in zip to clarify intent for future maintainers.

for prob_type, reward in zip(test_ds["problem_type"], rewards, strict=True):

jamesbraza added 2 commits June 6, 2025 14:37

Reusing extract_answer_loose in accuracy_reward

f67b432

Moved README baselines example to use accuracy_reward with test flag

e86f8f5

jamesbraza requested review from albertbou92, geemi725, maykcaldas, sidnarayanan and whitead June 6, 2025 21:50

jamesbraza self-assigned this Jun 6, 2025

jamesbraza added the enhancement label (New feature or request) Jun 6, 2025

Copilot AI review requested due to automatic review settings June 6, 2025 21:50

Copilot AI reviewed Jun 6, 2025

Ryan-Rhys reviewed Jun 6, 2025

Comment thread README.md Outdated

Ryan-Rhys approved these changes Jun 6, 2025

Reverted back to original example, with comment mentioning the existence of accuracy_reward

a5e140d

jamesbraza merged commit 97042fd into main Jun 6, 2025
3 checks passed

jamesbraza deleted the test-in-baseline branch June 6, 2025 22:53

jamesbraza merged 3 commits into main from test-in-baseline

3 participants
