feat(visdrone): live browser demo (path 1) + preview ONNX weights#3
Open
aalvsz wants to merge 1 commit into
Resolves the "Path 1: pending trained weights" stub: paths 1 (browser)
and 2 (Python) now work against a real, MIT-licensed VisDrone preview
model trained locally on Apple Metal.
What changed:
  demo/index.html          Self-contained browser demo. Pulls the ONNX
                           from ander2221/visdrone-yolo9-preview on first
                           visit, runs inference via onnxruntime-web
                           (WebGPU → WASM fallback), draws annotated
                           boxes. Webcam + image input. Override the
                           source repo via ?repo=org/name in the URL.
  src/load_finetuned.py    Helper that reproduces libreyolo's
                           _rebuild_for_new_classes path so the trained
                           hybrid checkpoint (80-channel cls intermediate
                           + 10-class final) loads cleanly.
  src/export_and_push.py   End-to-end CLI: torch -> ONNX (dynamic batch),
                           create HF Hub repo, upload .pt + .onnx + auto-
                           generated model card. Used to publish
                           ander2221/visdrone-yolo9-preview.
  src/train.py             Drop the hardcoded nb_classes=10; let the
                           factory auto-detect from the COCO-pretrained
                           checkpoint, then trainer rebuilds for VisDrone
                           when it reads data.yaml's nc=10. Fixes the
                           previous shape-mismatch on weight load.
  src/infer.py             Use the new load_finetuned helper.
  README.md                Path 1 banner flipped to "Live (preview)" with
                           an honest "5 epochs on Apple Metal" status
                           note and an upgrade path for fully-trained
                           upstream weights.
  .gitignore               Ignore logs/ and export/ which only contain
                           transient training/export artifacts.
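
For reviewers who want to reason about the `?repo=org/name` override without opening the HTML, the lookup can be sketched roughly as below. This is a Python illustration, not the demo's actual JavaScript. The function names and the `model.onnx` filename are hypothetical, since the commit message does not name the ONNX file. The `/resolve/<revision>/<file>` URL shape is the standard Hugging Face Hub download path.

```python
from urllib.parse import urlparse, parse_qs

DEFAULT_REPO = "ander2221/visdrone-yolo9-preview"
# Hypothetical filename: the PR does not say what the ONNX file is called.
ONNX_FILENAME = "model.onnx"


def resolve_repo(page_url: str, default: str = DEFAULT_REPO) -> str:
    """Return the repo named by ?repo=org/name, or the default repo."""
    query = parse_qs(urlparse(page_url).query)
    candidate = query.get("repo", [default])[0].strip()
    # Accept only a plain "org/name" shape; anything else falls back.
    parts = candidate.split("/")
    if len(parts) != 2 or not all(parts):
        return default
    return candidate


def onnx_url(repo: str, filename: str = ONNX_FILENAME, revision: str = "main") -> str:
    """Build a Hugging Face Hub download URL for a file in a model repo."""
    return f"https://huggingface.co/{repo}/resolve/{revision}/{filename}"


if __name__ == "__main__":
    page = "https://example.test/demo/index.html?repo=org/name"
    print(onnx_url(resolve_repo(page)))
```

Falling back to the default on a malformed value is one possible design choice; the real demo may instead surface an error.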
The preview weights:
https://huggingface.co/ander2221/visdrone-yolo9-preview
Trained for 5 epochs on the full Voxel51/VisDrone2019-DET train split
(7766 images), imgsz=384, batch=8, lr0=0.005, on a Mac M-series MPS
GPU (~12 minutes wall clock). Loss dropped 14.9 → 5.4. Real detections
on val images (e.g. 34 cars + 1 bus on a held-out frame at conf 0.15+).
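
As a sanity check on how short this run is, the figures above imply the following step count and throughput. This is a back-of-the-envelope sketch: it assumes the last partial batch of each epoch is kept, and it treats "~12 minutes" as exactly 12, so the rate is approximate.

```python
import math

TRAIN_IMAGES = 7766    # train split size quoted above
BATCH_SIZE = 8
EPOCHS = 5
WALL_CLOCK_MIN = 12    # "~12 minutes", so derived rates are approximate
LOSS_START, LOSS_END = 14.9, 5.4

# Assumption: the final partial batch of each epoch is not dropped.
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)   # 971
total_steps = steps_per_epoch * EPOCHS                   # 4855
images_seen = TRAIN_IMAGES * EPOCHS
images_per_sec = images_seen / (WALL_CLOCK_MIN * 60)
loss_reduction = 1 - LOSS_END / LOSS_START

print(f"steps/epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"throughput: ~{images_per_sec:.0f} images/s")
print(f"loss reduction: ~{loss_reduction:.0%}")
```

Roughly 4,900 optimizer steps is a small budget for a detector, which is consistent with the "preview" label on these weights.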
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Closes the "Path 1: pending trained weights" stub from the original VisDrone PR. Paths 1 (browser) and 2 (Python) now both work against a real, MIT-licensed preview model.
What's live
🤗 Preview weights: https://huggingface.co/ander2221/visdrone-yolo9-preview
🌐 Browser demo: `visdrone-finetune/demo/index.html` (zero-install, fetches ONNX from HF on first visit, caches)
The demo:
What's in the PR
Training honesty
These are preview weights, not production:
Detections are real and look right on held-out val images — e.g. 34 cars + 1 bus correctly identified on `9999938_00000_d_0000496.jpg` at conf 0.15+. Confidences are modest (0.2-0.6 typical) because the model is undertrained. A full ~50-epoch run on a real GPU would replace these.
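
To make the "at conf 0.15+" figure concrete, here is a minimal sketch of thresholding detections and counting them per class. The `Detection` type and `count_by_class` helper are hypothetical and are not part of this PR or of libreyolo. The sample values are made up for illustration and are not output from the preview model.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Detection(NamedTuple):
    label: str
    conf: float
    box: Tuple[float, float, float, float]  # x1, y1, x2, y2 in pixels


def count_by_class(detections: Iterable[Detection], min_conf: float = 0.15) -> Counter:
    """Count detections per class label, keeping only conf >= min_conf."""
    return Counter(d.label for d in detections if d.conf >= min_conf)


# Made-up detections for illustration only.
sample = [
    Detection("car", 0.58, (10, 20, 60, 50)),
    Detection("car", 0.22, (70, 20, 120, 50)),
    Detection("bus", 0.31, (130, 10, 220, 60)),
    Detection("car", 0.09, (5, 5, 15, 15)),  # below the threshold, dropped
]

counts = count_by_class(sample)
print(dict(counts))
```

With an undertrained model, a low threshold such as 0.15 keeps more true positives at the cost of more false positives, so the threshold would likely need revisiting once fully trained weights are available.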
The model card on HF Hub explicitly tags this as v0.1-preview with the same caveats.
Upgrade path
When real weights are trained:
Test plan
Related
🤖 Generated with Claude Code