Refactor DPO dataset construction to remove hardcoded parameters and TODOs in examples/dpo.py #498

@Ashish-Patnaik

Description
In examples/dpo.py, the _make_dataset function currently relies on hardcoded values for max_length (512) and batch_size (16). There is also an outstanding TODO comment (# TODO(epot): !!!!) explicitly flagging this section for cleanup.

This differs from other examples like seq2seq.py and lora.py, where these parameters are defined in get_config() and passed down as arguments. This inconsistency makes it harder to configure the DPO training run without modifying the internal logic of the dataset builder.

Solution
I propose refactoring examples/dpo.py to match the pattern used in the other example scripts:

  1. Move the batch_size and max_length definitions to the top of get_config().
  2. Update _make_dataset to accept these values as keyword arguments (*, training, batch_size, max_length).
  3. Remove the placeholder TODO comment.

Alternatives I've considered
Leaving it as-is works functionally, but it leaves technical debt in the codebase and keeps the examples inconsistent for new users trying to learn the library.

Additional context
I am happy to open a PR to standardize this example.
