feat(visdrone): live browser demo (path 1) + preview ONNX weights by aalvsz · Pull Request #3 · LibreYOLO/use-cases

aalvsz · 2026-04-25T16:23:36Z

Summary

Closes the "Path 1: pending trained weights" stub from the original VisDrone PR. Paths 1 (browser) and 2 (Python) now both work against a real, MIT-licensed preview model.

What's live

🤗 Preview weights: https://huggingface.co/ander2221/visdrone-yolo9-preview
🌐 Browser demo: `visdrone-finetune/demo/index.html` (zero-install, fetches ONNX from HF on first visit, caches)

The demo:

Self-contained single HTML file matching `blur-faces/demo/index.html` conventions
Inference via onnxruntime-web — WebGPU first, WASM fallback
Image upload + webcam input
Boxes + class labels drawn on a canvas
10 VisDrone classes color-coded
Override source repo via `?repo=org/name` URL param
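
The `?repo=` override and the Hub fetch can be sketched as follows. The demo itself does this in JavaScript, so this Python version only illustrates the logic. The default repo ID comes from this PR; the file name `model.onnx` and both function names are assumptions, not taken from the demo source.

```python
from urllib.parse import parse_qs, urlparse

DEFAULT_REPO = "ander2221/visdrone-yolo9-preview"  # preview repo named in this PR
ONNX_FILENAME = "model.onnx"  # hypothetical file name; check the actual HF repo


def resolve_repo(page_url: str, default: str = DEFAULT_REPO) -> str:
    """Return the repo from ?repo=org/name, or the default if absent or malformed."""
    query = parse_qs(urlparse(page_url).query)
    candidate = query.get("repo", [default])[0].strip()
    parts = candidate.split("/")
    # Accept only the "org/name" form; anything else falls back to the default.
    if len(parts) != 2 or not all(parts):
        return default
    return candidate


def onnx_url(repo_id: str, filename: str = ONNX_FILENAME, revision: str = "main") -> str:
    """Build the Hugging Face Hub download URL for a file in a model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


if __name__ == "__main__":
    page = "https://example.org/demo/index.html?repo=LibreYOLO/visdrone-yolo9"
    print(onnx_url(resolve_repo(page)))
```

Falling back to the default on a malformed value keeps the page usable when the parameter is mistyped.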

What's in the PR

| File | Role |
| --- | --- |
| `demo/index.html` | Browser demo (~300 LOC, vanilla JS module) |
| `src/export_and_push.py` | torch → ONNX (dynamic batch) + HF Hub upload + auto model card |
| `src/load_finetuned.py` | Helper reproducing libreyolo's `_rebuild_for_new_classes` path so the trained hybrid checkpoint loads cleanly |
| `src/train.py` | Drop hardcoded `nb_classes=10`; let the factory auto-detect from COCO weights and the trainer rebuild from `data.yaml` |
| `src/infer.py` | Use the new load helper |
| `README.md` | Path 1 banner flipped to "Live (preview)" with honest status |
| `.gitignore` | Ignore `logs/` and `export/` (transient) |

Training honesty

These are preview weights, not production:

Trained on a Mac M-series GPU (Apple Metal Performance Shaders) — not a real datacenter GPU
5 epochs, imgsz=384, batch=8, ~12 min wall clock
Full Voxel51/VisDrone2019-DET train split (7766 images)
Loss dropped 14.9 → 5.4
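
To put these numbers in perspective, here is a rough sketch of the step count and throughput they imply. It assumes the last partial batch of each epoch is kept and that the ~12 minutes is spent entirely on training steps, so the throughput is only a ballpark figure.

```python
import math

TRAIN_IMAGES = 7766   # from the PR
BATCH_SIZE = 8        # from the PR
EPOCHS = 5            # from the PR
WALL_CLOCK_MIN = 12   # approximate, from the PR


def training_budget(images: int, batch: int, epochs: int, minutes: float):
    """Return (steps per epoch, total steps, images processed per second)."""
    steps_per_epoch = math.ceil(images / batch)  # assumes the partial batch is kept
    total_steps = steps_per_epoch * epochs
    images_per_second = images * epochs / (minutes * 60)
    return steps_per_epoch, total_steps, images_per_second


steps_per_epoch, total_steps, ips = training_budget(
    TRAIN_IMAGES, BATCH_SIZE, EPOCHS, WALL_CLOCK_MIN
)
print(f"{steps_per_epoch} steps/epoch, {total_steps} steps total, ~{ips:.0f} img/s")
```

That is 971 steps per epoch and 4855 optimizer steps in total, which is very few for a 10-class detector and consistent with the undertrained behaviour described here.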

Detections are real and look right on held-out val images — e.g. 34 cars + 1 bus correctly identified on `9999938_00000_d_0000496.jpg` at conf 0.15+. Confidences are modest (0.2-0.6 typical) because the model is undertrained. A full ~50-epoch run on a real GPU would replace these.
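
A per-class count at a confidence threshold like the one quoted above could be produced along these lines. This is a minimal sketch: it assumes the model emits raw class logits (as noted in the test plan), it skips box decoding and non-maximum suppression, and the class order shown is an assumption that should be checked against `data.yaml`.

```python
import math
from collections import Counter

# The 10 VisDrone-DET categories; this ordering is an assumption.
VISDRONE_CLASSES = [
    "pedestrian", "people", "bicycle", "car", "van",
    "truck", "tricycle", "awning-tricycle", "bus", "motor",
]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def count_detections(class_logits, conf_threshold: float = 0.15) -> Counter:
    """Count candidates per class whose best class score clears the threshold.

    class_logits: iterable of rows, one row of 10 raw logits per candidate box.
    NMS is not applied here, so real counts would be lower on overlapping boxes.
    """
    counts = Counter()
    for logits in class_logits:
        best = max(range(len(logits)), key=logits.__getitem__)
        if sigmoid(logits[best]) >= conf_threshold:
            counts[VISDRONE_CLASSES[best]] += 1
    return counts
```

With a threshold of 0.15, a candidate needs a best-class logit of roughly -1.73 or higher to be kept.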

The model card on HF Hub explicitly tags this as v0.1-preview with the same caveats.

Upgrade path

When real weights are trained:

Train on whatever GPU is available (script unchanged).
Run `python -m src.export_and_push --weights weights/visdrone.pt --repo-id LibreYOLO/visdrone-yolo9 ...`
Update the demo's default `HF_REPO` constant in `demo/index.html` (or just expect users to use the `?repo=` URL param).
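
For orientation, the export-and-upload step could look roughly like the sketch below. This is not the contents of `src/export_and_push.py`: the function names, the tensor names `images` and `output`, and the file handling are assumptions. Only `torch.onnx.export`, `HfApi.create_repo`, and `HfApi.upload_file` are real library calls.

```python
from pathlib import Path


def dynamic_batch_axes(input_name: str = "images", output_name: str = "output") -> dict:
    """Mark dimension 0 of the input and output as a dynamic batch axis."""
    return {input_name: {0: "batch"}, output_name: {0: "batch"}}


def export_onnx(model, onnx_path: str, imgsz: int = 384) -> None:
    """Export a torch module to ONNX with a dynamic batch dimension."""
    import torch  # imported lazily so the helper above works without torch

    model.eval()
    dummy = torch.zeros(1, 3, imgsz, imgsz)  # NCHW dummy input used for tracing
    torch.onnx.export(
        model,
        dummy,
        onnx_path,
        input_names=["images"],    # hypothetical tensor names
        output_names=["output"],
        dynamic_axes=dynamic_batch_axes(),
    )


def push_to_hub(repo_id: str, files: list) -> None:
    """Create the model repo if needed and upload each file to its root."""
    from huggingface_hub import HfApi  # requires a configured HF token

    api = HfApi()
    api.create_repo(repo_id=repo_id, exist_ok=True)
    for path in files:
        api.upload_file(
            path_or_fileobj=str(path),
            path_in_repo=Path(path).name,
            repo_id=repo_id,
        )
```

Keeping only the batch axis dynamic means the spatial size stays fixed at the export `imgsz`, so the demo has to resize inputs to that size before inference.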

Test plan

`python -m src.train` runs end-to-end on MPS without errors
`python -m src.infer` produces sensible detections on val images
`python -m src.export_and_push` produces working ONNX and successfully uploads to HF Hub
HF repo is publicly accessible (302 → CDN, content-type ok)
ONNX loads in onnxruntime-cpu, output shape `(1, 14, 3024)` for imgsz=384, range looks right (logits, not probabilities)
Browser demo end-to-end test on a fresh Chrome — pending; see screenshots in next iteration if needed
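
The reported ONNX output shape can be sanity-checked by hand. The sketch below assumes an anchor-free detection head with three feature maps at strides 8, 16 and 32, and a channel layout of 4 box values plus one score per class. The PR does not state this layout, but it reproduces the `(1, 14, 3024)` figure exactly.

```python
def expected_output_shape(imgsz: int, num_classes: int,
                          strides=(8, 16, 32), batch: int = 1):
    """Return (batch, 4 + num_classes, number of candidate boxes).

    Assumes a square input and one candidate per feature-map cell
    at each stride (anchor-free head).
    """
    if any(imgsz % s for s in strides):
        raise ValueError("imgsz must be divisible by every stride")
    candidates = sum((imgsz // s) ** 2 for s in strides)
    channels = 4 + num_classes  # 4 box values + per-class scores
    return (batch, channels, candidates)


# imgsz=384: 48*48 + 24*24 + 12*12 = 2304 + 576 + 144 = 3024 candidates
print(expected_output_shape(384, 10))  # (1, 14, 3024)
```

Under the same assumptions, a model exported at imgsz=640 would produce `(1, 14, 8400)`, which is a quick check to run after re-exporting fully trained weights.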

Resolves the "Path 1: pending trained weights" stub: paths 1 (browser) and 2 (Python) now work against a real, MIT-licensed VisDrone preview model trained locally on Apple Metal.

What changed:

demo/index.html
Self-contained browser demo. Pulls the ONNX from ander2221/visdrone-yolo9-preview on first visit, runs inference via onnxruntime-web (WebGPU → WASM fallback), draws annotated boxes. Webcam + image input. Override the source repo via ?repo=org/name in the URL.

src/load_finetuned.py
Helper that reproduces libreyolo's _rebuild_for_new_classes path so the trained hybrid checkpoint (80-channel cls intermediate + 10-class final) loads cleanly.

src/export_and_push.py
End-to-end CLI: torch -> ONNX (dynamic batch), create HF Hub repo, upload .pt + .onnx + auto-generated model card. Used to publish ander2221/visdrone-yolo9-preview.

src/train.py
Drop the hardcoded nb_classes=10: let the factory auto-detect from the COCO-pretrained checkpoint, then the trainer rebuilds for VisDrone when it reads data.yaml's nc=10. Fixes the previous shape mismatch on weight load.

src/infer.py
Use the new load_finetuned helper.

README.md
Path 1 banner flipped to "Live (preview)" with an honest "5 epochs on Apple Metal" status note and an upgrade path for fully-trained upstream weights.

.gitignore
Ignore logs/ and export/, which only contain transient training/export artifacts.

The preview weights: https://huggingface.co/ander2221/visdrone-yolo9-preview

Trained for 5 epochs on the full Voxel51/VisDrone2019-DET train split (7766 images), imgsz=384, batch=8, lr0=0.005, on a Mac M-series MPS GPU (~12 minutes wall clock). Loss dropped 14.9 → 5.4. Real detections on val images (e.g. 34 cars + 1 bus on a held-out frame at conf 0.15+).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

aalvsz wants to merge 1 commit into LibreYOLO:main from aalvsz:visdrone-demo-path1
