Add documentation for OSL JSON format and dataset tools #37

Open
SilvioGiancola wants to merge 1 commit into dev from docs--improve

@SilvioGiancola
Collaborator

  • Updated index.md to include a link to the OSL JSON Format documentation.
  • Expanded tni.md with detailed configuration samples for classification, tracking, and localization tasks, including JSON format examples for annotations.
  • Introduced dataset-conversion.md to outline scripts for building OSL datasets and converting OSL JSON annotations to/from Parquet + WebDataset.
  • Created hf-dataset-transfer.md for scripts to download and upload OSL datasets via Hugging Face Hub.
  • Updated mkdocs.yml to include new documentation sections for Data Formats and Tools.
  • Enhanced README files in the tools directory to reference the OSL JSON schema and provide clearer guidance on usage.

Copilot AI review requested due to automatic review settings May 19, 2026 16:02
Contributor

Copilot AI left a comment

Pull request overview

This PR expands the project’s user-facing documentation around the OSL JSON v2.0 dataset/annotation format and the companion dataset tooling (conversion and Hugging Face transfer), and wires the new pages into the MkDocs site navigation.

Changes:

  • Adds a dedicated OSL JSON format guide and links it from the main docs index, repo README, tools readmes, and API docs.
  • Adds new docs pages for dataset conversion and Hugging Face dataset transfer tooling, and updates mkdocs.yml nav accordingly.
  • Expands docs/tni/tni.md with concrete YAML/JSON examples and clarifies inference/prediction persistence (infer() vs save_predictions()).

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tools/README.md | Adds a pointer to the OSL JSON schema doc for dataset tools. |
| tools/download/README.md | Clarifies schema dependency and cleans up formatting in the download/upload tooling docs. |
| tools/convert/README.md | Adds OSL JSON schema reference and improves readability of pipeline notes/examples. |
| README.md | Adds OSL JSON format link and includes minimal JSON examples + updated tool script paths. |
| opensportslib/apis/README.md | Re-documents task wrapper behavior via a clearer method contract table and schema reference. |
| mkdocs.yml | Adds new nav sections for Data Formats and Tools pages. |
| docs/tools/hf-dataset-transfer.md | New MkDocs page describing HF download/upload scripts and usage patterns. |
| docs/tools/dataset-conversion.md | New MkDocs page describing conversion/build scripts and round-trip workflows. |
| docs/tni/tni.md | Replaces include-based config snippets with explicit examples and expands OSL JSON annotation guidance. |
| docs/index.md | Adds quick link to the OSL JSON format page and fixes minor formatting. |
| docs/data/osl-json-format.md | New canonical OSL JSON v2.0 format guide with examples, field definitions, and checklist. |
| docs/contributing.md | Replaces include directive with an inlined contributing guide. |
| docs/api/api.md | Documents the public task wrapper contract and links to OSL JSON format docs. |
Comments suppressed due to low confidence (1)

docs/tools/hf-dataset-transfer.md:5

  • In this docs page, the reference to the schema uses a repo-root path (docs/data/osl-json-format.md) and is formatted as inline code, so it won’t become a working MkDocs link. Use a relative docs link (e.g., ../data/osl-json-format.md) and make it a markdown link so navigation works on the published site.
Scripts to download and upload OSL datasets via Hugging Face Hub. These tools
read file references from OSL JSON `data[].inputs[]`; see
`docs/data/osl-json-format.md` for the dataset schema.
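
The relative target suggested in the comment above can be derived mechanically from the two repo-root paths. A minimal sketch, assuming both pages live under the same `docs/` tree as listed in this PR; the helper name `relative_doc_link` and the link text are hypothetical and not part of the repository:

```python
import posixpath


def relative_doc_link(source_page: str, target_page: str, text: str) -> str:
    """Build a Markdown link from one docs page to another.

    Both paths are given relative to the repository root; the link target is
    computed relative to the directory that contains the source page.
    """
    start = posixpath.dirname(source_page)
    target = posixpath.relpath(target_page, start=start)
    return f"[{text}]({target})"


link = relative_doc_link(
    "docs/tools/hf-dataset-transfer.md",
    "docs/data/osl-json-format.md",
    "OSL JSON format",  # hypothetical link text
)
print(link)
```

Using `posixpath` rather than `os.path` keeps the forward slashes that Markdown links expect, regardless of the platform the script runs on.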

Comment on lines +10 to +18
  - Downloads an OSL split by repo, revision, and split name.
  - JSON mode downloads `<split>.json` and all referenced inputs; Parquet mode downloads `<split>/`.
- `download_hf_repo.py`
  - Downloads a full HuggingFace repository snapshot for a given repo and revision.
  - Best when you want the entire repo content for a branch/tag/commit.
- `upload_osl_hf.py`
  - Uploads local dataset inputs from JSON to a HuggingFace dataset repo.
  - Automatically creates the target dataset repo if it does not exist.
  - Automatically creates the target revision branch when `--revision` is not `main` and the branch is missing.
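
The "create if missing" behaviour described for `upload_osl_hf.py` could look roughly like the sketch below. This is an illustration under stated assumptions, not the script's actual code: the function name `ensure_repo_and_branch` and the repo id are hypothetical, and `api` is assumed to behave like `huggingface_hub.HfApi`, whose `create_repo` and `create_branch` methods accept `exist_ok=True`. A recording stand-in is used so the example runs without network access.

```python
def ensure_repo_and_branch(api, repo_id: str, revision: str = "main") -> None:
    """Create the dataset repo and, for non-main revisions, the branch.

    `api` is expected to behave like huggingface_hub.HfApi; exist_ok=True
    turns both calls into no-ops when the repo or branch already exists.
    """
    api.create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)
    if revision != "main":
        api.create_branch(
            repo_id=repo_id, branch=revision, repo_type="dataset", exist_ok=True
        )


class RecordingApi:
    """Stand-in for HfApi that records calls instead of contacting the Hub."""

    def __init__(self) -> None:
        self.calls = []

    def create_repo(self, **kwargs) -> None:
        self.calls.append(("create_repo", kwargs))

    def create_branch(self, **kwargs) -> None:
        self.calls.append(("create_branch", kwargs))


api = RecordingApi()
# "my-org/my-dataset" and "v2" are placeholder values for illustration.
ensure_repo_and_branch(api, "my-org/my-dataset", revision="v2")
print([name for name, _ in api.calls])
```

In real use, an `HfApi()` instance would be passed in place of `RecordingApi()`.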
Comment on lines +1 to +5
# Download Tools

Scripts to download and upload OSL datasets via Hugging Face Hub. These tools
read file references from OSL JSON `data[].inputs[]`; see
`docs/data/osl-json-format.md` for the dataset schema.
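
For readers unfamiliar with the `data[].inputs[]` notation in the snippet above, the following sketch shows one way a tool might enumerate those entries. Only the `data` and `inputs` keys come from the text; the helper name `iter_inputs` and the `type` and `path` fields in the sample document are placeholders for illustration, not the documented schema.

```python
import json
from typing import Any, Iterator, Tuple


def iter_inputs(doc: dict) -> Iterator[Tuple[int, Any]]:
    """Yield (sample_index, input_entry) for every entry in data[].inputs[]."""
    for index, sample in enumerate(doc.get("data", [])):
        for entry in sample.get("inputs", []):
            yield index, entry


# Placeholder document: the "type" and "path" fields are assumptions made
# for this example only; consult the OSL JSON format guide for real fields.
sample_doc = json.loads("""
{
  "data": [
    {"inputs": [{"type": "video", "path": "clips/a.mp4"}]},
    {"inputs": [{"type": "video", "path": "clips/b.mp4"},
                {"type": "video", "path": "clips/c.mp4"}]}
  ]
}
""")

references = list(iter_inputs(sample_doc))
print(len(references))
```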
Scripts for building OpenSportsLib (OSL) datasets from raw sources, and for
converting OSL JSON annotations to and from a Parquet + WebDataset
representation suited for large-scale training. For the annotation schema, see
the OSL JSON format guide in `docs/data/osl-json-format.md`.
Comment thread docs/contributing.md
### 4. Push Branch (just once)
Push your feature branch to the remote repository.
```bash
git push origin feature/your-feature-name
```
Comment thread docs/tni/tni.md
Comment on lines 337 to 348
Usage:
```python
### Load weights from HF ###

#### For Classification ####
myModel.load_weights(weights="OpenSportsLab/OSL-cls-action-mvitv2")

#### For Localization ####
weights = "OpenSportsLab/OSL-loc-snbas-2023-e2e" # SNBAS - 2 classes (E2E spot)
weights = "OpenSportsLab/OSL-loc-snbas-2025-e2e" # SNBAS - 12 classes (E2E spot)
myModel.load_weights(weights=weights)
```