This is the official implementation of the IEEE Transactions on Robotics (TRO) paper "Spatial Balancing for RGB-Thermal Semantic Segmentation in Autonomous Driving: A Study from Analysis to Improvement".
Click the image above to watch the demonstration video.
We propose a Gaussian-guided regional balancing masking method to balance segmentation performance across different image regions. We also introduce a spatial-weighted loss to further improve overall segmentation performance. Experimental results on the MFNet and KP datasets demonstrate the effectiveness of our method in mitigating spatial bias and improving balanced performance.
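To give a rough feel for what a spatially weighted loss can look like, here is a minimal, generic sketch in plain Python. It is not the loss from the paper or this repository: the function names, the Gaussian shape, the choice to weight borders more than the centre, and the `sigma` value are all assumptions made for illustration only.

```python
import math


def gaussian_weight_map(height, width, sigma=0.5):
    """Hypothetical weight map: 1.0 at the image centre, larger toward the borders.

    Assumption: the direction and range of the weighting are illustrative,
    not taken from the paper.
    """
    cy, cx = (height - 1) / 2, (width - 1) / 2
    weights = []
    for y in range(height):
        row = []
        for x in range(width):
            # Offsets from the centre, normalised so each axis spans [-1, 1].
            dy = (y - cy) / max(cy, 1)
            dx = (x - cx) / max(cx, 1)
            g = math.exp(-(dx * dx + dy * dy) / (2 * sigma ** 2))
            row.append(2.0 - g)  # values lie in [1, 2)
        weights.append(row)
    return weights


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of the per-pixel negative log-likelihood.

    probs[y][x] is a list of class probabilities, labels[y][x] is a class id,
    and weights[y][x] is the spatial weight of that pixel.
    """
    total, norm = 0.0, 0.0
    for prob_row, label_row, weight_row in zip(probs, labels, weights):
        for p, t, w in zip(prob_row, label_row, weight_row):
            total += -w * math.log(max(p[t], 1e-12))  # clamp to avoid log(0)
            norm += w
    return total / norm
```

Dividing by the sum of the weights keeps the loss on the same scale as an unweighted cross-entropy, so the weight map only changes how much each region contributes. For the actual masking and loss design, refer to the paper and the code in this repository.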
Place the downloaded datasets in the 'datasets' folder with the following structure:
<datasets>
|-- <MFdataset>
    |-- <RGB>
    |-- <Thermal>
    |-- <Label>
    |-- train.txt
    |-- val.txt
    |-- test.txt
|-- <KPdataset>
    |-- <images>
        |-- set00
        |-- set01
        ...
    |-- <labels>
    |-- train.txt
    |-- val.txt
    |-- test.txt

For usage instructions, please refer to CRM.
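Before training, it can help to confirm that the folder layout matches the tree above. The following is a small sketch using only the Python standard library. The helper name `find_missing` is hypothetical and not part of this repository, and it assumes the split files (`train.txt`, `val.txt`, `test.txt`) sit directly inside each dataset folder.

```python
from pathlib import Path

# Expected entries, copied from the directory tree above.
# Assumption: the split files sit directly inside each dataset folder.
EXPECTED = {
    "MFdataset": ["RGB", "Thermal", "Label", "train.txt", "val.txt", "test.txt"],
    "KPdataset": ["images", "labels", "train.txt", "val.txt", "test.txt"],
}


def find_missing(root):
    """Return the expected paths (relative to `root`) that do not exist."""
    root = Path(root)
    missing = []
    for dataset, entries in EXPECTED.items():
        for entry in entries:
            path = root / dataset / entry
            if not path.exists():
                missing.append(path.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # Run from the repository root; prints nothing if the layout is complete.
    for path in find_missing("datasets"):
        print("missing:", path)
```

The check only looks at the top-level entries of each dataset. It does not inspect the `set00`, `set01`, ... subfolders or the contents of the split files.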
We provide pre-trained weights for two RGB-T semantic segmentation datasets.
MFNet dataset:

| Architecture | Backbone | mIoU | Weight (Google Drive) | Weight (NAS) |
|---|---|---|---|---|
| Ours | Swin-T | 59.4% | MF_swin_T | MF_swin_T |
| Ours | Swin-S | 62.1% | MF_swin_S | MF_swin_S |
| Ours | Swin-B | 64.6% | MF_swin_B | MF_swin_B |

KP dataset:

| Architecture | Backbone | mIoU | Weight (Google Drive) | Weight (NAS) |
|---|---|---|---|---|
| Ours | Swin-T | 52.3% | KP_swin_T | KP_swin_T |
| Ours | Swin-S | 54.9% | KP_swin_S | KP_swin_S |
| Ours | Swin-B | 56.8% | KP_swin_B | KP_swin_B |
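
For readers unfamiliar with the metric in the tables above, mean intersection-over-union averages the per-class IoU. The sketch below shows the standard definition in plain Python. The function name `mean_iou` is hypothetical, and the evaluation code in this repository may differ in details such as accumulating counts over the whole test set or excluding an unlabeled class.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in the prediction or the label.

    `pred` and `target` are flat sequences of integer class ids.
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and label
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so `mean_iou(pred, target, 2)` is 7/12, roughly 58.3%.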
If you use our work in your research, please cite:
@ARTICLE{li2026spatial,
author={Haotian Li and Henry K. Chu and Yuxiang Sun},
journal={IEEE Transactions on Robotics},
title={Spatial Balancing for RGB-Thermal Semantic Segmentation in Autonomous Driving: A Study From Analysis to Improvement},
year={2026},
volume={42},
number={},
pages={1840-1855},
doi={10.1109/TRO.2026.3677009}}
Our network architecture and codebase are built upon CRM.
The inspiration and analytical approach of this paper are drawn from ZoneEval.
