Hung quan | Van Hoang | My Kim | Thien Nhan | Thanh Dat
Demo video (`iccv.webm`): inference result samples on Fisheye1K using `640x640_fisheye8k.engine` (FP32).
- Platform: Jetson AGX Xavier (JetPack 5.1.2, L4T R35.4.1)
- TensorRT Version: 8.5.0.2
- Torch Version: 2.1.0 (`torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64`)
- Torchvision Version: v0.16.1
- Input Resolution: 1024×1024 or 640×640
🔧 You can adjust parameters via the configuration file `config/config.yaml`.
| Model | AP0.5:0.95 | AP0.5 | APS | APM | APL | F1 Score |
|---|---|---|---|---|---|---|
| 1024×1024_fisheye8k + 1024×1024_visdra_m (best) | 0.5238 | 0.7226 | 0.3369 | 0.6877 | 0.5925 | 0.6139 |
| 640×640_fisheye8k (FP32) | 0.5556 | 0.7915 | 0.3810 | 0.6880 | 0.5727 | 0.5995 |
Accuracy for the FP32 model is taken from the ICCV 2025 evaluation. The FP16 model has not yet been evaluated for accuracy, so only its FPS is reported.
| Model | FPS | Normalized (max=25) |
|---|---|---|
| 640×640_fisheye8k (FP32) | 12.09 | 0.4836 |
| 640×640_fisheye8k (FP16) | 21.89 | 0.8756 |
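The "Normalized" column is consistent with dividing the measured FPS by the stated maximum of 25. A minimal sketch of that calculation follows; the function name is illustrative, and capping values above 25 at 1.0 is an assumption that the table itself does not state.

```python
MAX_FPS = 25.0  # normalization ceiling, taken from the "max=25" table header


def normalized_fps(fps, max_fps=MAX_FPS):
    """Scale an FPS measurement to [0, 1].

    Capping at max_fps is an assumption; the table only shows values below 25.
    """
    return min(fps, max_fps) / max_fps


if __name__ == "__main__":
    # FPS values copied from the table above.
    for name, fps in [("FP32", 12.09), ("FP16", 21.89)]:
        print(f"{name}: {normalized_fps(fps):.4f}")
```

For the two measurements in the table, this gives 0.4836 and 0.8756, matching the reported values.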
FP16-trained weights are currently not available.
| Model (FP32) | 640×640 Weights | 1024×1024 Weights |
|---|---|---|
| dfine_hgnetv2_m_fisheye8k | Download | Download |
| dfine_hgnetv2_m_visdra | Download | Download |
We use the following command to build .engine files:
⚠️ Note: Using `--fp16` or `--int8` on FP32-trained models may cause numerical overflow.
```bash
trtexec \
  --onnx=model/dfine_640.onnx \
  --saveEngine=model/dfine_640.engine \
  --memPoolSize=workspace:11000 \
  --useCudaGraph \
  --best \
  --minShapes=images:1x3x640x640,orig_target_sizes:1x2 \
  --optShapes=images:1x3x640x640,orig_target_sizes:1x2 \
  --maxShapes=images:1x3x640x640,orig_target_sizes:1x2
```

This work is built upon the amazing D-FINE project.
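Since the models are offered at both 640×640 and 1024×1024, the `trtexec` build command shown earlier could be parameterized by resolution. The sketch below only assembles the argument list from the same flags. The helper name `build_trtexec_args` and the 1024 file names are hypothetical and not part of this repository.

```python
import shlex


def build_trtexec_args(onnx_path, engine_path, size, workspace_mib=11000):
    """Return the trtexec argument list for a square, batch-1 input of `size` pixels."""
    # Static shape: min, opt, and max are identical, as in the original command.
    shape = f"images:1x3x{size}x{size},orig_target_sizes:1x2"
    return [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        f"--memPoolSize=workspace:{workspace_mib}",
        "--useCudaGraph",
        "--best",
        f"--minShapes={shape}",
        f"--optShapes={shape}",
        f"--maxShapes={shape}",
    ]


if __name__ == "__main__":
    # Hypothetical paths; substitute the files you actually exported.
    args = build_trtexec_args("model/dfine_1024.onnx", "model/dfine_1024.engine", 1024)
    print(shlex.join(args))
    # To build the engine on the device: subprocess.run(args, check=True)
```

Passing the arguments as a list rather than as a single shell string avoids quoting problems if a path contains spaces.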