QuickInfer-web: a browser-based YOLOv8 inference benchmark using ONNX Runtime Web
- Run YOLOv8 object detection directly in the browser
- Support for multiple backends: WASM (CPU), WebGPU, and WebGL (see the session sketch after this list)
- Multiple model sizes: YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, YOLOv8x
- Real-time performance metrics: preprocessing, inference, postprocessing
- Mobile browser inference with camera input support
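As a rough sketch of how a backend is selected and inference is timed with ONNX Runtime Web (this is not this repo's actual code; the input name `images`, the 640×640 shape, and the dummy tensor are assumptions about a standard YOLOv8 ONNX export):

```ts
import * as ort from 'onnxruntime-web';

// Sketch only: the input name "images" and the 1x3x640x640 shape are
// assumptions about a standard YOLOv8 ONNX export; adjust to your model.
export async function timedInference(modelUrl: string): Promise<void> {
  // Execution providers are tried in order: WebGPU first, WASM as fallback.
  const session = await ort.InferenceSession.create(modelUrl, {
    executionProviders: ['webgpu', 'wasm'],
  });

  // Dummy tensor standing in for a preprocessed (letterboxed, normalized) frame.
  const input = new ort.Tensor(
    'float32',
    new Float32Array(1 * 3 * 640 * 640),
    [1, 3, 640, 640],
  );

  const t0 = performance.now();
  const output = await session.run({ images: input });
  const t1 = performance.now();
  console.log(`inference: ${(t1 - t0).toFixed(1)} ms`, Object.keys(output));
}
```

Depending on the `onnxruntime-web` version, the WebGPU provider may need to be imported from `onnxruntime-web/webgpu` instead of the main entry point.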
Install dependencies and start the dev server:

```bash
npm install
npm run dev
```

Open http://localhost:5173 in your browser.

Build:

```bash
npm run build:github
```

The built files will be in the `dist` directory.
Models are hosted on ModelScope/quick-infer-models to ensure proper CORS support for browser downloads.
Default models:
- YOLOv8n (~12MB)
- YOLOv8s (~43MB)
- YOLOv8m (~99MB)
- YOLOv8l (~167MB)
- YOLOv8x (~260MB)
Edit `.env` or `.env.github`:

```
VITE_MODELSCOPE_REPO=your-username/your-model-repo
VITE_MODELS=[{"name":"your-model.onnx","size":"~50MB"}]
```
Upload ONNX models to your ModelScope repository.
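For illustration, the two variables above could be consumed in Vite code along these lines (a sketch; the `resolve/master` download-URL layout is an assumption about ModelScope hosting, so verify it against your repository):

```ts
/// <reference types="vite/client" />

interface ModelEntry {
  name: string;
  size: string;
}

// Vite exposes VITE_-prefixed variables on import.meta.env at build time.
const repo = import.meta.env.VITE_MODELSCOPE_REPO as string;
const models: ModelEntry[] = JSON.parse(import.meta.env.VITE_MODELS as string);

// Assumed ModelScope download-URL layout; verify against your repository.
const modelUrls = models.map(
  (m) => `https://modelscope.cn/models/${repo}/resolve/master/${m.name}`,
);
```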
| Backend | Chrome | Edge | Firefox | Safari |
|---|---|---|---|---|
| WASM | ✓ | ✓ | ✓ | ✓ |
| WebGL | ✓ | ✓ | ✓ | ✓ |
| WebGPU | ✓ | ✓ | Partial | ✗ |
WebGPU provides the best performance but requires browser support.
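A runtime check can mirror the table above; a minimal sketch (the fallback order here is an illustrative design choice, not this repo's actual logic):

```ts
// Pick execution providers for ort.InferenceSession.create() at runtime.
function pickExecutionProviders(): string[] {
  if ('gpu' in navigator) {
    return ['webgpu', 'wasm']; // WebGPU API present (Chrome, Edge)
  }
  if (document.createElement('canvas').getContext('webgl2')) {
    return ['webgl', 'wasm']; // GPU path via WebGL
  }
  return ['wasm']; // CPU fallback, available everywhere
}
```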
TODO:
- Add a repository link to the website footer
- YOLOv11 / YOLOv26 support
- SAM3 support