Multi-model sensor fusion for autonomous driving scene understanding.
This project combines three perception models into a unified scene understanding system:

┌─────────────────────────────────────────────────────────────┐
│                        CAMERA INPUT                         │
└─────────────────────────┬───────────────────────────────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ▼                 ▼                 ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│    YOLOv8     │ │     U-Net     │ │     MiDaS     │
│   Detection   │ │ Segmentation  │ │     Depth     │
└───────────────┘ └───────────────┘ └───────────────┘
        │                 │                 │
        └─────────────────┼─────────────────┘
                          ▼
                 ┌─────────────────┐
                 │     FUSION      │
                 └─────────────────┘

| Model | Task | Dataset | Output |
|---|---|---|---|
| YOLOv8-small | Object Detection | BDD100K | Bounding boxes |
| U-Net | Drivable Area | BDD100K | Segmentation mask |
| MiDaS | Depth Estimation | Pretrained | Depth map |
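
The three outputs listed above (bounding boxes, a drivable-area mask, and a depth map) have to be combined per object in the fusion step. The sketch below shows one plausible way to do that. It is not the code in `src/inference.py`. `FusedObject`, `fuse`, and `min_overlap` are hypothetical names, and it assumes the depth map holds MiDaS-style relative inverse depth, where larger values mean closer.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class FusedObject:
    """One detection enriched with depth and drivable-area context (hypothetical type)."""
    label: str
    box: tuple              # (x1, y1, x2, y2) in pixels, clipped to the image
    relative_depth: float   # median depth-map value inside the box
    on_drivable_area: bool  # True if the box's bottom edge lies on the mask


def fuse(boxes, labels, drivable_mask, depth_map, min_overlap=0.5):
    """Attach a relative depth and a drivable-area flag to each detection.

    boxes: iterable of (x1, y1, x2, y2); drivable_mask: HxW bool array;
    depth_map: HxW float array with the same shape as the mask.
    """
    h, w = depth_map.shape
    fused = []
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        # Clip the box to the image so slicing never goes out of bounds.
        x1, x2 = max(0, int(x1)), min(w, int(x2))
        y1, y2 = max(0, int(y1)), min(h, int(y2))
        if x2 <= x1 or y2 <= y1:
            continue  # box lies entirely outside the image
        # The median is less sensitive to background pixels than the mean.
        depth = float(np.median(depth_map[y1:y2, x1:x2]))
        # The bottom edge approximates where the object touches the ground.
        bottom_edge = drivable_mask[y2 - 1, x1:x2]
        on_road = bool(bottom_edge.mean() >= min_overlap)
        fused.append(FusedObject(label, (x1, y1, x2, y2), depth, on_road))
    # Assumption: larger depth values are closer, so sort nearest first.
    fused.sort(key=lambda obj: obj.relative_depth, reverse=True)
    return fused
```

Because the depth is relative, the sorted order is meaningful within a single frame, but the values are not distances in metres.
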

| Component | Metric | Value |
|---|---|---|
| Detection | mAP@50 | ~0.50 |
| Segmentation | IoU | ~0.85 |
| Depth | Relative depth | Pretrained |
| Ensemble | FPS | ~15-20 |
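
For reference, the IoU and FPS figures in the table can be measured along the following lines. This is a minimal sketch, not the evaluation code from the notebooks. `mask_iou` and `measure_fps` are hypothetical helpers, and treating two empty masks as IoU 1.0 is a convention chosen here.

```python
import time

import numpy as np


def mask_iou(pred, target):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: counted as perfect agreement
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


def measure_fps(run_once, n_warmup=3, n_runs=20):
    """Average frames per second of a zero-argument callable."""
    for _ in range(n_warmup):
        run_once()  # warm-up runs are excluded from the timing
    start = time.perf_counter()
    for _ in range(n_runs):
        run_once()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed
```

With the `PerceptionEnsemble` object from the usage example, something like `measure_fps(lambda: ensemble.predict("path/to/dashcam.jpg"))` would time the full pipeline, including image loading. The result depends heavily on the hardware and the ONNX Runtime execution provider.
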

├── notebooks/
│   ├── 01_Detection.ipynb      # YOLOv8 fine-tuning
│   ├── 02_Segmentation.ipynb   # U-Net training
│   └── 03_Fusion_ONNX.ipynb    # Ensemble + export
├── models/                     # ONNX models (see Hugging Face)
├── src/
│   └── inference.py            # Inference pipeline
├── demo/
│   └── app.py                  # Gradio demo
└── assets/
    └── sample_output.png

git clone https://github.com/aryanp2107/Autonomous-Perception-Ensemble.git
cd Autonomous-Perception-Ensemble
pip install -r requirements.txt

from src.inference import PerceptionEnsemble
# Load ensemble
ensemble = PerceptionEnsemble(
    detection_model="models/yolov8n_bdd100k.onnx",
    segmentation_model="models/unet_drivable.onnx",
    depth_model="models/midas_small.onnx",
)
# Run on image
result = ensemble.predict("path/to/dashcam.jpg")
# Visualize
ensemble.visualize(result, save_path="output.png")

cd demo
python app.py
# Opens Gradio interface at localhost:7860

| Notebook | Description | Colab |
|---|---|---|
| 01_Detection | Fine-tune YOLOv8 on BDD100K | |
| 02_Segmentation | Train U-Net for drivable area | |
| 03_Fusion_ONNX | Combine models + ONNX export | |

The ONNX models are hosted on Hugging Face:
# Download all models
huggingface-cli download aryanp2107/autonomous-perception-ensemble --local-dir models/

Or download individually:
BDD100K — Berkeley DeepDrive 100K
We use a ~10K image subset via Roboflow for training.
- Detection: Ultralytics YOLOv8
- Segmentation: PyTorch U-Net
- Depth: Intel MiDaS
- Export: ONNX Runtime
- Demo: Gradio
Deployment on Arxelos is coming soon.
License: MIT
Author: Aryan Patel