refactor: remove redundant fields from TaskSolution and ValidationResult by kohankhaki · Pull Request #50 · VectorInstitute/automated_capability_evaluation

kohankhaki · 2026-01-21T08:37:58Z

PR Type

Fix

Short Description

Remove redundant task_id and task string fields from TaskSolution and ValidationResult dataclasses. These fields duplicated information already available in the nested objects (task_obj and task_solution). Added convenience properties to maintain backward compatibility for accessing these values. Also clarified documentation for the reasoning field to distinguish it from Bloom's taxonomy level.

Tests Added

None

This change is

…ValidationResult

…tency

Copilot

Pull request overview

This PR removes redundant data duplication in TaskSolution and ValidationResult dataclasses by eliminating direct task_id and task string fields that duplicated information from nested objects. The refactor maintains backward compatibility through convenience properties and updates documentation to clarify field purposes.

Changes:

Removed redundant task_id and task fields from TaskSolution, keeping only the task object (renamed from task_obj)
Removed redundant task_id and task fields from ValidationResult, relying on the nested task_solution object
Added convenience properties (task_id and task_text) to both classes for backward-compatible access

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
src/schemas/solution_schemas.py	Refactored `TaskSolution` to remove redundant fields and rename `task_obj` to `task`, added convenience properties
src/schemas/validation_schemas.py	Refactored `ValidationResult` to remove redundant fields, added convenience properties
src/schemas/GENERATION_PIPELINE_SCHEMAS.md	Updated documentation to reflect schema changes and clarify the `reasoning` field purpose
src/base_stages/solve_tasks.py	Updated `TaskSolution` instantiation to use new schema structure
src/base_stages/validate_tasks.py	Updated references from `task_obj` to `task` and from `task` to `task_text`

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-21T21:22:56Z

- `task_obj`: Task (required, Task dataclass object with full hierarchy)
+- `task`: Task (required, Task dataclass object with full hierarchy including task_id, task text, capability, etc.)
+- `solution`: String (required, the final answer/solution to the task)
+- `reasoning`: String (required, the step-by-step explanation generated by the solver showing how the solution was derived. This is the LLM's chain-of-thought when solving the task, NOT Bloom's taxonomy level which is stored in `task.bloom_level`)


The clarification about Bloom's taxonomy appears defensive and may indicate past confusion. Consider simplifying to focus on what 'reasoning' IS rather than what it is NOT. For example: 'the step-by-step explanation generated by the solver showing how the solution was derived (chain-of-thought)'

Suggested change

- `reasoning`: String (required, the step-by-step explanation generated by the solver showing how the solution was derived. This is the LLM's chain-of-thought when solving the task, NOT Bloom's taxonomy level which is stored in `task.bloom_level`)

- `reasoning`: String (required, the step-by-step explanation generated by the solver showing how the solution was derived; this is the model's chain-of-thought for the task. The Bloom's taxonomy level for the task is stored separately in `task.bloom_level`.)

afkanpour

@afkanpour reviewed 5 files and all commit messages, and made 2 comments.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @kohankhaki and @saidul-islam98).

src/schemas/validation_schemas.py line 26 at r1 (raw file):

    def task_id(self) -> str:
        """Get task_id from the task_solution for convenience."""
        return self.task_solution.task_id

Does task_solution contain task_id?

Code quote:

task_id

src/schemas/solution_schemas.py line 31 at r1 (raw file):

    def task_text(self) -> str:
        """Get task text from the task object for convenience."""
        return self.task.task

Does this refer to the problem statement? If so, can you rename it to task_statement?

Code quote:

task

kohankhaki

@kohankhaki made 2 comments.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @afkanpour and @saidul-islam98).

src/schemas/solution_schemas.py line 31 at r1 (raw file):

Previously, afkanpour (Arash) wrote…

Does this refer to the problem statement? If so, can you rename it to task_statement?

Updated all of the task variable name to task_statement wherever we are referring to problem statement

src/schemas/validation_schemas.py line 26 at r1 (raw file):

Previously, afkanpour (Arash) wrote…

Does task_solution contain task_id?

Yes, TaskSolution has a task_id property that returns self.task.task_id.

saidul-islam98 · 2026-01-22T19:16:46Z

For our current pipeline, we are storing the following information in each tasks:

{ "task_id": "Easy_Remember__task_000", "task_statement": "<task_statement>", "choices": { "A": "<option_text>", "B": "<option_text>", "C": "<option_text>", "D": "<option_text>" }, "correct_answer": "<correct answer option>", "chapter_id": "<domain_book_name_chapter_number>", "chapter_relpath": "<bookname/domain_book_name_chapter_number>", # relative path of the chapter "difficulty": "Easy/Medium/Hard", "blooms_level": "Remember/Understand/Apply/Analyze/Evaluate/Create", }

Could you please add "correct_answer", "chapter_id", and "chapter_relpath" fields? (Or maybe I can add and update the schema them in my PR if you think its more appropriate). Also, can the choices be just a dictionary (as there can only be 4 choices in an MCQ question)?

I think the schema should stay as-is to keep it general, but let me know what you think:

Re: choices format
The current List[Dict] format ([{"label": "A", "solution": "..."}]) is more flexible—it allows adding metadata per choice if needed later (e.g., explanation for why an option is wrong). A simple dict works for basic MCQ but limits extensibility.

Re: correct_answer
This belongs in TaskSolution, not Task. The Task represents the problem before it's solved (Stage 3). The correct answer is determined in Stage 4 (solution generation) and stored in TaskSolution.solution.

Re: chapter_id, chapter_relpath
These are domain-specific (textbook structure). Use generation_metadata for these:

Task( task_id="Easy_Remember__task_000", task_statement="...", choices=[{"label": "A", "solution": "..."}, ...], difficulty="Easy", bloom_level="Remember", generation_metadata={ "chapter_id": "domain_book_name_chapter_number", "chapter_relpath": "bookname/domain_book_name_chapter_number", } )

Got it, I will update my saving function accordingly. Thanks.

You should be able to use the standard save_tasks(), load_tasks() etc. from io_utils.py - no need for a custom saving function.

The only difference is how you construct the objects:

task = Task( task_id="Easy_Remember__task_000", task_statement="...", choices=[{"label": "A", "solution": "..."}, ...], difficulty="Easy", bloom_level="Remember", capability=capability, generation_metadata={ "chapter_id": "...", "chapter_relpath": "...", } ) # Standard save/load works save_tasks([task], metadata, path) tasks, metadata = load_tasks(path) # Access your custom fields via generation_metadata chapter_id = tasks[0].generation_metadata.get("chapter_id")

afkanpour

@afkanpour reviewed 7 files and all commit messages, and resolved 2 discussions.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @kohankhaki and @saidul-islam98).

kohankhaki added 2 commits January 21, 2026 00:35

refactor: remove redundant task/task_id fields from TaskSolution and …

adec7a9

…ValidationResult

fix: rename task property to task_text in ValidationResult for consis…

68d9e4c

…tency

kohankhaki requested review from afkanpour, Copilot and saidul-islam98 January 21, 2026 21:22

Copilot AI reviewed Jan 21, 2026

View reviewed changes

afkanpour requested changes Jan 22, 2026

View reviewed changes

kohankhaki commented Jan 22, 2026

View reviewed changes

kohankhaki force-pushed the refactor/remove-redundant-schema-fields branch from 37e01b2 to 68d9e4c Compare January 22, 2026 18:24

refactor: rename task to task_statement and clean up schema fields

252b76f

saidul-islam98 reviewed Jan 22, 2026

View reviewed changes

afkanpour requested a review from saidul-islam98 January 23, 2026 02:02

afkanpour approved these changes Jan 23, 2026

View reviewed changes

kohankhaki merged commit 6bb8af1 into main Jan 23, 2026
0 of 2 checks passed

kohankhaki deleted the refactor/remove-redundant-schema-fields branch January 23, 2026 07:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: remove redundant fields from TaskSolution and ValidationResult#50

refactor: remove redundant fields from TaskSolution and ValidationResult#50
kohankhaki merged 3 commits into
mainfrom
refactor/remove-redundant-schema-fields

kohankhaki commented Jan 21, 2026 •

edited by afkanpour

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI Jan 21, 2026

Uh oh!

afkanpour left a comment

Uh oh!

kohankhaki left a comment

Uh oh!

saidul-islam98 Jan 22, 2026

Uh oh!

kohankhaki Jan 22, 2026

Uh oh!

saidul-islam98 Jan 22, 2026

Uh oh!

kohankhaki Jan 22, 2026

Uh oh!

afkanpour left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

	- `reasoning`: String (required, the step-by-step explanation generated by the solver showing how the solution was derived. This is the LLM's chain-of-thought when solving the task, NOT Bloom's taxonomy level which is stored in `task.bloom_level`)
	- `reasoning`: String (required, the step-by-step explanation generated by the solver showing how the solution was derived; this is the model's chain-of-thought for the task. The Bloom's taxonomy level for the task is stored separately in `task.bloom_level`.)

Conversation

kohankhaki commented Jan 21, 2026 • edited by afkanpour Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Type

Short Description

Tests Added

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

afkanpour left a comment

Choose a reason for hiding this comment

Uh oh!

kohankhaki left a comment

Choose a reason for hiding this comment

Uh oh!

saidul-islam98 Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

kohankhaki Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

saidul-islam98 Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

kohankhaki Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

afkanpour left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

kohankhaki commented Jan 21, 2026 •

edited by afkanpour

Loading