OpenSportsLab · SilvioGiancola · May 19, 2026
diff --git a/README.md b/README.md
@@ -22,6 +22,7 @@ OpenSportsLib is designed for **researchers, ML engineers, and sports analytics
 ## Quick links
 
 - **Documentation:** https://opensportslab.github.io/opensportslib/
+- **OSL JSON format:** https://opensportslab.github.io/opensportslib/data/osl-json-format/
 - **PyPI:** https://pypi.org/project/opensportslib/
 - **Issues:** https://github.com/OpenSportsLab/opensportslib/issues
 
@@ -81,7 +82,82 @@ Use it as the main entry point to find:
 See the [Model Zoo](docs/model-zoo.md) for available pretrained models,
 reported scores, datasets, and loading snippets.
 
---
+---
+
+## Dataset format
+
+OpenSportsLib annotation files use the **OSL JSON v2.0** format. A dataset JSON
+contains top-level metadata, a shared `labels` schema, and a `data` array where
+each sample points to one or more inputs.
+
+Minimal classification sample:
+
+```json
+{
+  "labels": {
+    "action": {
+      "type": "single_label",
+      "labels": ["pass", "shot"]
+    }
+  },
+  "data": [
+    {
+      "id": "clip_0001",
+      "inputs": [
+        {
+          "type": "video",
+          "path": "clips/clip_0001.mp4",
+          "fps": 25.0
+        }
+      ],
+      "labels": {
+        "action": {
+          "label": "shot"
+        }
+      }
+    }
+  ]
+}
+```
+
+Minimal localization sample:
+
+```json
+{
+  "labels": {
+    "action": {
+      "type": "single_label",
+      "labels": ["pass", "shot"]
+    }
+  },
+  "data": [
+    {
+      "id": "game_0001",
+      "inputs": [
+        {
+          "type": "video",
+          "path": "games/game_0001.mp4",
+          "fps": 25.0
+        }
+      ],
+      "events": [
+        {
+          "head": "action",
+          "label": "pass",
+          "position_ms": 1240
+        }
+      ]
+    }
+  ]
+}
+```
+
+Relative paths in `inputs[].path` are resolved from the split media root in the
+YAML config, for example `DATA.train.video_path`. See the full
+[OSL JSON format guide](docs/data/osl-json-format.md) for field definitions,
+multi-modal examples, prediction payloads, and conversion notes.
+
+---
 
 ## Quickstart
 
@@ -188,8 +264,8 @@ from opensportslib.tools import (
 ### Scripts
 
 ```bash
-python tools/download_osl_hf.py --repo-id <org/repo> --revision main --split test --format parquet --output-dir downloaded_data
-python tools/upload_osl_hf.py --repo-id <org/repo> --json-path <local_dataset.json> --split test --revision main
+python tools/download/download_osl_hf.py --repo-id <org/repo> --revision main --split test --format parquet --output-dir downloaded_data
+python tools/download/upload_osl_hf.py --repo-id <org/repo> --json-path <local_dataset.json> --split test --revision main
 ```
 
 Downloads are placed under `<output-dir>/<revision>/<split>`.
@@ -206,9 +282,13 @@ Predict when key events happen in long untrimmed sports videos.
 
 ### Action Retrieval
 Search and retrieve relevant clips or moments from a collection of sports videos.
+This task is on the roadmap and covered by the OSL data model, but it is not yet
+a first-class OpenSportsLib training workflow.
 
 ### Action Description / Captioning
 Generate text descriptions for sports events and temporal segments.
+This task is on the roadmap and covered by the OSL data model, but it is not yet
+a first-class OpenSportsLib training workflow.
 
 ---
 
@@ -228,6 +308,7 @@ Generate text descriptions for sports events and temporal segments.
 Use the README for the fast start, then go deeper through:
 
 - Full documentation: https://opensportslab.github.io/opensportslib/
+- OSL JSON format: [docs/data/osl-json-format.md](docs/data/osl-json-format.md)
 - High-level API guide: [opensportslib/apis/README.md](opensportslib/apis/README.md)
 - Configuration guide: https://opensportslab.github.io/opensportslib/tni/config-guide/
 - Example configs: [examples/configs/](examples/configs/)
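
Review note on the `Dataset format` section added to the README: the two JSON samples imply a few structural rules (every sample has `inputs`, every label belongs to a head declared in the top-level `labels` schema, and `position_ms` is a non-negative number). The sketch below checks only those rules. `check_osl_document` is a hypothetical helper, not part of OpenSportsLib, and the rules are inferred from the examples rather than taken from the full format guide.

```python
import json

# Localization sample as shown in the README hunk above.
SAMPLE = """
{
  "labels": {"action": {"type": "single_label", "labels": ["pass", "shot"]}},
  "data": [
    {
      "id": "game_0001",
      "inputs": [{"type": "video", "path": "games/game_0001.mp4", "fps": 25.0}],
      "events": [{"head": "action", "label": "pass", "position_ms": 1240}]
    }
  ]
}
"""


def check_osl_document(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed.

    Hypothetical helper: the rules are assumptions inferred from the README
    samples, not the authoritative OSL JSON specification.
    """
    problems = []
    heads = doc.get("labels", {})

    def allowed(head):
        # Labels declared for this head in the shared top-level schema.
        return heads.get(head, {}).get("labels", [])

    for sample in doc.get("data", []):
        sid = sample.get("id", "<missing id>")
        if not sample.get("inputs"):
            problems.append(f"{sid}: no inputs")
        # Classification-style annotation: one label per head.
        for head, value in sample.get("labels", {}).items():
            if value.get("label") not in allowed(head):
                problems.append(f"{sid}: unknown label for head '{head}'")
        # Localization-style annotation: timestamped events.
        for event in sample.get("events", []):
            head = event.get("head")
            if event.get("label") not in allowed(head):
                problems.append(f"{sid}: unknown event label for head '{head}'")
            pos = event.get("position_ms")
            if not isinstance(pos, (int, float)) or pos < 0:
                problems.append(f"{sid}: invalid position_ms {pos!r}")
    return problems


problems = check_osl_document(json.loads(SAMPLE))
print(problems or "no problems found")
```

A check like this could run before upload or training to catch labels that are missing from the schema; the real validation rules should come from the OSL JSON format guide.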

diff --git a/docs/api/api.md b/docs/api/api.md
@@ -38,6 +38,32 @@ High-level entry points for training and inference.
 - **`localization.py`**  
   API for temporal action spotting tasks.
 
+#### Public task wrapper contract
+
+Use the high-level wrappers from `opensportslib.apis`:
+
+```python
+from opensportslib.apis import ClassificationModel, LocalizationModel
+```
+
+Both wrappers inherit the shared `BaseTaskModel` contract:
+
+| Method | Purpose | Return value |
+| --- | --- | --- |
+| `load_weights(weights=...)` | Load a local checkpoint or Hugging Face model ID. | `None` |
+| `train(train_set=..., valid_set=...)` | Train on OSL JSON split files. | Best checkpoint path or `None` |
+| `infer(test_set=...)` | Run prediction on an OSL JSON split file. | In-memory OSL JSON-style prediction dict |
+| `evaluate(test_set=...)` | Compute task metrics against ground truth. | Metrics dict |
+| `evaluate(test_set=..., predictions=...)` | Evaluate an existing prediction dict or prediction file. | Metrics dict |
+| `save_predictions(output_path=..., predictions=...)` | Persist a prediction dict returned by `infer()`. | Saved file path |
+
+`infer()` is prediction-focused and returns a payload to the caller. Use
+`save_predictions(...)` when a workflow needs an explicit prediction file. Do
+not rely on task-specific trainer artifacts as the public persistence API.
+
+Annotation and prediction payloads follow the OSL JSON data model. See
+[OSL JSON Format](../data/osl-json-format.md) for the user-facing schema.
+
 ---
 
 ### `core/`
@@ -194,4 +220,4 @@ This is where you can modify:
 
 ### High-Level Workflow
 
-YAML Config -> APIs (apis/) -> Datasets (datasets/) -> Models (models/) -> Trainer (core/trainer/) -> Metrics (metrics/)
+YAML Config -> APIs (apis/) -> Datasets (datasets/) -> Models (models/) -> Trainer (core/trainer/) -> Metrics (metrics/)
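
Review note on the public task wrapper contract added to `docs/api/api.md`: the method table can be read as a structural interface. The sketch below mirrors it with a `typing.Protocol` and walks the infer, save, evaluate path the text recommends. `TaskModel`, `StubModel`, and `predict_save_evaluate` are hypothetical names for illustration; the stub returns placeholder values and says nothing about how `ClassificationModel` or `LocalizationModel` are constructed or what they actually return.

```python
from typing import Any, Optional, Protocol


class TaskModel(Protocol):
    """Structural mirror of the method table (keyword names taken from it)."""

    def load_weights(self, weights: str) -> None: ...
    def train(self, train_set: str, valid_set: str) -> Optional[str]: ...
    def infer(self, test_set: str) -> dict: ...
    def evaluate(self, test_set: str, predictions: Any = None) -> dict: ...
    def save_predictions(self, output_path: str, predictions: dict) -> str: ...


def predict_save_evaluate(model: TaskModel, test_set: str, output_path: str) -> dict:
    """Run prediction, persist it explicitly, then score the saved file."""
    predictions = model.infer(test_set=test_set)
    saved_path = model.save_predictions(output_path=output_path, predictions=predictions)
    return model.evaluate(test_set=test_set, predictions=saved_path)


class StubModel:
    """Placeholder stand-in so the sketch runs without OpenSportsLib installed."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def load_weights(self, weights: str) -> None:
        self.calls.append("load_weights")

    def train(self, train_set: str, valid_set: str) -> Optional[str]:
        self.calls.append("train")
        return None

    def infer(self, test_set: str) -> dict:
        self.calls.append("infer")
        return {"data": []}  # placeholder payload, not a real prediction

    def evaluate(self, test_set: str, predictions: Any = None) -> dict:
        self.calls.append("evaluate")
        return {}  # placeholder metrics

    def save_predictions(self, output_path: str, predictions: dict) -> str:
        self.calls.append("save_predictions")
        return output_path


stub = StubModel()
predict_save_evaluate(stub, test_set="test.json", output_path="predictions.json")
print(stub.calls)
```

Typing the helper against the protocol rather than a concrete wrapper keeps it usable for both tasks, which matches the shared `BaseTaskModel` contract described above.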
diff --git a/docs/contributing.md b/docs/contributing.md
@@ -1 +1,102 @@
---8<-- "CONTRIBUTING.md"
+# CONTRIBUTING.md
+This guide outlines the workflow and standards for developers looking to extend or maintain the OpenSportsLib library.
+
+## AI Agent Contributions
+For AI-agent driven development, follow `AGENTS.md` in the repository root.
+
+## 1. Development Environment Setup
+To begin contributing, set up a local development environment and install the package in "editable" mode so your changes take effect without reinstalling.
+
+#### Step 1: Clone the Repository
+```bash
+git clone https://github.com/OpenSportsLab/opensportslib.git
+cd opensportslib
+```
+#### Step 2: Create a Virtual Environment
+Use Conda to manage dependencies and ensure Python 3.12 compatibility.
+```bash
+conda create -n osl python=3.12 pip
+conda activate osl
+```
+#### Step 3: Install in Editable Mode
+Install the core package in editable mode. Optional dependencies are installed in Step 4:
+```bash
+# Install core package in editable mode
+pip install -e .
+```
+
+#### Step 4: Set Up the Environment (CUDA-Aware PyTorch and Optional Dependencies)
+```bash
+# Install PyTorch (CPU/GPU auto-detected)
+opensportslib setup
+
+# Optional: install PyTorch Geometric support
+opensportslib setup --pyg
+
+# Optional: install DALI support
+opensportslib setup --dali
+```
+
+## 2. Branching and Merging - Daily workflow for developers
+
+#### Branches
+- `main` → stable, production-ready
+- `dev` → active development integration branch
+- `dev-<name>` → developer personal branch
+- `feature-<name>` → new features
+- `fix-<name>` → bug fixes
+
+#### Rules
+- ❌ Never push directly to `main`
+- ❌ Never commit directly to `dev`
+- ✅ Always create a feature branch from `dev`
+- ✅ Always use Pull Requests
+- ✅ PRs must target `dev`, NOT `main`
+
+### 1. Sync Repo
+Switch to `dev` and pull the latest changes before starting work.
+```bash
+git checkout dev
+git pull origin dev
+```
+
+### 2. Create Feature Branch
+Create a new branch from `dev` with a descriptive name.
+```bash
+git checkout -b feature-<feature_name>
+```
+Naming Examples:
+- *feature-model*
+- *feature-new-dataset*
+
+### 3. Work Locally
+Commit your work often using the following commit style guidelines:
+
+- *feat:* New feature
+- *fix:* Bug fix
+- *refactor:* Code cleanup
+- *docs:* Documentation update
+
+Example commit:
+```bash
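+git add .    # stage new and modified files under the current directory
+# or
+git add -u   # stage only changes to already-tracked files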
+
+git commit -m "feat: add model registry"
+```
+
+### 4. Push Branch (just once)
+Push your feature branch to the remote repository.
+```bash
+git push origin feature-<feature_name>
+```
+
+### 5. Open Pull Request (PR) → dev
+Raise a Pull Request (PR) to merge your branch back into the `dev` branch.
+
+✅ PR Checklist:
+- [ ] Tests Pass: All existing logic remains functional.
+- [ ] Runs on GPU: Code is compatible with CUDA environments.
+- [ ] Config Works: YAML configurations resolve correctly.
+- [ ] Docs Updated: Relevant documentation reflects your changes.
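
Review note on the branching and commit conventions in `docs/contributing.md`: the branch names (`main`, `dev`, `dev-<name>`, `feature-<name>`, `fix-<name>`) and commit prefixes (`feat:`, `fix:`, `refactor:`, `docs:`) are regular enough to check mechanically. The sketch below is one way to do that. `is_valid_branch` and `is_valid_commit` are hypothetical helpers, and the allowed character set for `<name>` is an assumption, since the guide does not define it.

```python
import re

# Branch patterns listed in the guide. The character class for <name>
# (letters, digits, underscore, dot, hyphen) is an assumption.
BRANCH_RE = re.compile(r"^(main|dev|(dev|feature|fix)-[\w.-]+)$")

# Commit prefixes listed in the guide, followed by a non-empty summary.
COMMIT_RE = re.compile(r"^(feat|fix|refactor|docs): \S")


def is_valid_branch(name: str) -> bool:
    """True if the branch name matches one of the documented patterns."""
    return BRANCH_RE.match(name) is not None


def is_valid_commit(message: str) -> bool:
    """True if the first line of the message starts with a documented prefix."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_RE.match(lines[0]) is not None


for branch in ["feature-new-dataset", "fix-loader", "my-branch"]:
    print(branch, is_valid_branch(branch))

for message in ["feat: add model registry", "added stuff"]:
    print(repr(message), is_valid_commit(message))
```

Checks like these could run in a local Git hook or in CI, though the guide does not say that either is in place.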