
feat: pretraining dataset validation #20

Open
MichaelTj02 wants to merge 4 commits into main from dataset-validation-pretrain


@MichaelTj02

Collaborator

Summary

Adds a pretraining dataset validation step. When the dataset is invalid, training initialization is blocked and the detected dataset issues are shown to the user.
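
For reviewers who want a picture of the control flow described above, here is a minimal sketch of a validate-then-gate pattern. It is not the code in this PR: `ValidationResult`, `validate_dataset`, `start_training`, and the `IMAGE_EXTENSIONS` set are hypothetical names, and the two checks shown (the path is a directory, and it contains at least one image file) are assumed examples of what "invalid" could mean.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, List

# Assumption: which file types count as training images is illustrative only.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


@dataclass
class ValidationResult:
    """Collects human-readable issues; an empty list means the dataset passed."""
    issues: List[str] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return not self.issues


def validate_dataset(path: str) -> ValidationResult:
    """Hypothetical validator: run cheap checks and record every issue found."""
    result = ValidationResult()
    root = Path(path)
    if not root.is_dir():
        result.issues.append(f"Dataset path is not a directory: {root}")
        return result  # nothing else can be checked without a directory
    images = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    ]
    if not images:
        result.issues.append(f"No image files found under: {root}")
    return result


def start_training(path: str, train_fn: Callable[[str], None]) -> List[str]:
    """Gate: call train_fn only if validation passes; return issues otherwise."""
    result = validate_dataset(path)
    if not result.is_valid:
        return result.issues  # training initialization is never reached
    train_fn(path)
    return []
```

In this sketch the caller displays the returned issues and skips initialization whenever the list is non-empty, which mirrors the blocking behaviour the summary describes.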

Type of change

  • Feature
  • Fix
  • Other (docs, refactor, chore, ci, etc.)

How was this tested?

  • Launch Autolume
  • Set the dataset path in the training module to a folder containing test data
  • Start training with both a valid and an invalid dataset
  • Verify that training proceeds normally for a valid dataset
  • Verify that training is blocked for an invalid dataset and that dataset validation issues are displayed before the training initialization phase begins

Attachments: 2026-06-03.23-27-03.mp4 (video), invalid-dataset (image)
