Various documentation improvements needed #82

@philipp-fischer

  • Explain that the expressions used for splitting shards into train/val during prepare are regexes
  • Explain how the seed offset can be used to get new random orders, and that by default the order is always the same
  • Mention SkipSample and explain when to use it
  • Update the list of sample types in basic/data_prep; some are missing.
  • Add a short how-to on automated (non-interactive) dataset preparation
  • Document the Metadataset options fully (e.g. dataset_config is not covered)
  • Explain the difference between get_val_dataset and get_val_datasets, and fix their docstrings
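
To illustrate why the first item matters, here is a minimal sketch of how a glob-style pattern behaves when it is interpreted as a regex. It uses only Python's standard `re` module. The shard names and the `select` helper are hypothetical, and the use of full-string matching is an assumption; the exact matching mode used during prepare should be confirmed against the implementation before documenting it.

```python
import re

# Hypothetical shard names, for illustration only.
shards = ["train-0000.tar", "train-0001.tar", "val-0000.tar"]


def select(pattern: str, names: list[str]) -> list[str]:
    """Return the names that the regular expression matches in full.

    Full-string matching is an assumption made for this sketch.
    """
    rx = re.compile(pattern)
    return [name for name in names if rx.fullmatch(name)]


# Regex syntax: ".*" means "any characters", "\." is a literal dot.
train = select(r"train-.*\.tar", shards)
val = select(r"val-.*\.tar", shards)

# Glob syntax read as a regex: "-*" means "zero or more dashes" and "."
# means "any single character", so this pattern matches none of the shards.
glob_style = select(r"train-*.tar", shards)

print(train)
print(val)
print(glob_style)
```

A user who types a glob-style pattern gets an empty split rather than an error, which is why the docs should state the regex requirement explicitly.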
