Add documentation for OSL JSON format and dataset tools #37
Open
SilvioGiancola wants to merge 1 commit into
SilvioGiancola (Collaborator) commented on May 19, 2026
- Updated index.md to include a link to the OSL JSON Format documentation.
- Expanded tni.md with detailed configuration samples for classification, tracking, and localization tasks, including JSON format examples for annotations.
- Introduced dataset-conversion.md to outline scripts for building OSL datasets and converting OSL JSON annotations to/from Parquet + WebDataset.
- Created hf-dataset-transfer.md for scripts to download and upload OSL datasets via Hugging Face Hub.
- Updated mkdocs.yml to include new documentation sections for Data Formats and Tools.
- Enhanced README files in the tools directory to reference the OSL JSON schema and provide clearer guidance on usage.
Contributor
Pull request overview
This PR expands the project’s user-facing documentation around the OSL JSON v2.0 dataset/annotation format and the companion dataset tooling (conversion and Hugging Face transfer), and wires the new pages into the MkDocs site navigation.
Changes:
- Adds a dedicated OSL JSON format guide and links it from the main docs index, repo README, tools readmes, and API docs.
- Adds new docs pages for dataset conversion and Hugging Face dataset transfer tooling, and updates `mkdocs.yml` nav accordingly.
- Expands `docs/tni/tni.md` with concrete YAML/JSON examples and clarifies inference/prediction persistence (`infer()` vs `save_predictions()`).
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tools/README.md | Adds a pointer to the OSL JSON schema doc for dataset tools. |
| tools/download/README.md | Clarifies schema dependency and cleans up formatting in the download/upload tooling docs. |
| tools/convert/README.md | Adds OSL JSON schema reference and improves readability of pipeline notes/examples. |
| README.md | Adds OSL JSON format link and includes minimal JSON examples + updated tool script paths. |
| opensportslib/apis/README.md | Re-documents task wrapper behavior via a clearer method contract table and schema reference. |
| mkdocs.yml | Adds new nav sections for Data Formats and Tools pages. |
| docs/tools/hf-dataset-transfer.md | New MkDocs page describing HF download/upload scripts and usage patterns. |
| docs/tools/dataset-conversion.md | New MkDocs page describing conversion/build scripts and round-trip workflows. |
| docs/tni/tni.md | Replaces include-based config snippets with explicit examples and expands OSL JSON annotation guidance. |
| docs/index.md | Adds quick link to the OSL JSON format page and fixes minor formatting. |
| docs/data/osl-json-format.md | New canonical OSL JSON v2.0 format guide with examples, field definitions, and checklist. |
| docs/contributing.md | Replaces include directive with an inlined contributing guide. |
| docs/api/api.md | Documents the public task wrapper contract and links to OSL JSON format docs. |
Comments suppressed due to low confidence (1)
docs/tools/hf-dataset-transfer.md:5
- In this docs page, the reference to the schema uses a repo-root path (`docs/data/osl-json-format.md`) and is formatted as inline code, so it won’t become a working MkDocs link. Use a relative docs link (e.g., `../data/osl-json-format.md`) and make it a Markdown link so navigation works on the published site.
Scripts to download and upload OSL datasets via Hugging Face Hub. These tools
read file references from OSL JSON `data[].inputs[]`; see
`docs/data/osl-json-format.md` for the dataset schema.
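As an illustration of the suggested fix, the relative target can be derived from the page's own location. This is a minimal Python sketch, not part of the PR: the helper name `relative_doc_link` and the link label are hypothetical, and only the two paths come from the comment above.

```python
import posixpath


def relative_doc_link(page: str, target: str, label: str) -> str:
    """Build a Markdown link from `page` to `target` (both repo-root paths)."""
    # MkDocs resolves links relative to the directory of the current page.
    start = posixpath.dirname(page)
    rel = posixpath.relpath(target, start=start)
    return f"[{label}]({rel})"


link = relative_doc_link(
    page="docs/tools/hf-dataset-transfer.md",
    target="docs/data/osl-json-format.md",
    label="OSL JSON format",  # hypothetical label text
)
print(link)  # [OSL JSON format](../data/osl-json-format.md)
```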
Comment on lines +10 to +18
  - Downloads an OSL split by repo, revision, and split name.
  - JSON mode downloads `<split>.json` and all referenced inputs; Parquet mode downloads `<split>/`.
- `download_hf_repo.py`
  - Downloads a full HuggingFace repository snapshot for a given repo and revision.
  - Best when you want the entire repo content for a branch/tag/commit.
- `upload_osl_hf.py`
  - Uploads local dataset inputs from JSON to a HuggingFace dataset repo.
  - Automatically creates the target dataset repo if it does not exist.
  - Automatically creates the target revision branch when `--revision` is not `main` and the branch is missing.
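The "create if missing" behaviour described for `upload_osl_hf.py` could be sketched as follows. This is not the script's actual implementation; the function name `ensure_upload_target` is hypothetical, and `api` is assumed to behave like `huggingface_hub.HfApi`, whose `create_repo` and `create_branch` methods accept `exist_ok`.

```python
from typing import Any


def ensure_upload_target(api: Any, repo_id: str, revision: str = "main") -> None:
    """Create the dataset repo and, if needed, the revision branch.

    ASSUMPTION: `api` exposes create_repo/create_branch like huggingface_hub.HfApi.
    """
    # exist_ok=True makes the call a no-op when the repo already exists.
    api.create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)
    # A new repo already has `main`, so only other revisions need a branch.
    if revision != "main":
        api.create_branch(
            repo_id=repo_id,
            branch=revision,
            repo_type="dataset",
            exist_ok=True,
        )
```

With the real client this would be called as `ensure_upload_target(HfApi(), repo_id, revision)`; passing the client in keeps the logic testable without network access.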
Comment on lines +1 to +5
# Download Tools

Scripts to download and upload OSL datasets via Hugging Face Hub. These tools
read file references from OSL JSON `data[].inputs[]`; see
`docs/data/osl-json-format.md` for the dataset schema.
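For readers unfamiliar with the `data[].inputs[]` notation, the traversal it implies can be sketched in a few lines of Python. The sample document is made up, and the `"path"` and `"type"` keys on each input are assumptions for illustration; the format guide remains the authority on the actual field names.

```python
import json
from typing import Iterator


def iter_input_paths(osl: dict) -> Iterator[str]:
    """Yield every file reference found under data[].inputs[]."""
    for item in osl.get("data", []):
        for inp in item.get("inputs", []):
            # ASSUMPTION: each input is an object with a "path" key.
            path = inp.get("path")
            if path:
                yield path


# Tiny made-up document, for illustration only.
sample = json.loads("""
{"data": [
  {"inputs": [{"type": "video", "path": "clips/a.mp4"}]},
  {"inputs": [{"type": "video", "path": "clips/b.mp4"},
              {"type": "video", "path": "clips/c.mp4"}]}
]}
""")
paths = list(iter_input_paths(sample))
print(paths)  # ['clips/a.mp4', 'clips/b.mp4', 'clips/c.mp4']
```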
Scripts for building OpenSportsLib (OSL) datasets from raw sources, and for
converting OSL JSON annotations to and from a Parquet + WebDataset
representation suited for large-scale training. For the annotation schema, see
the OSL JSON format guide in `docs/data/osl-json-format.md`.
### 4. Push Branch (just once)
Push your feature branch to the remote repository.
```bash
git push origin feature/your-feature-name
```
Comment on lines 337 to 348
Usage:
```bash
### Load weights from HF ###

#### For Classification ####
myModel.load_weights(weights="OpenSportsLab/OSL-cls-action-mvitv2")

#### For Localization ####
weights = "OpenSportsLab/OSL-loc-snbas-2023-e2e" # SNBAS - 2 classes (E2E spot)
weights = "OpenSportsLab/OSL-loc-snbas-2025-e2e" # SNBAS - 12 classes (E2E spot)
myModel.load_weights(weights=weights)
```