Official PyTorch implementation of "Learning a Universal Attention Refinement Module for CLIP-based Open-Vocabulary Segmentation".
Open-vocabulary segmentation aims to segment novel categories that are not seen during training. This project introduces an Attention Refinement Module (ARM) that significantly enhances CLIP-based open-vocabulary segmentation performance by effectively aggregating multi-level visual features from the CLIP encoder.
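Since the code is not yet released, the sketch below only illustrates the general idea of attention-weighted aggregation of multi-level features; the module name, layer choices, and shapes are assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class AttentionRefinementModule(nn.Module):
    """Hypothetical sketch: fuse token maps from several CLIP encoder blocks
    by scoring each level per token and softmax-normalizing across levels."""

    def __init__(self, dim: int):
        super().__init__()
        # One scalar score per token per level (assumed gating scheme)
        self.score = nn.Linear(dim, 1)

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        # feats: list of (B, N, D) token maps from different CLIP blocks
        x = torch.stack(feats, dim=1)   # (B, L, N, D)
        w = self.score(x)               # (B, L, N, 1)
        w = w.softmax(dim=1)            # attention over the L levels
        return (w * x).sum(dim=1)       # (B, N, D) refined features

arm = AttentionRefinementModule(dim=512)
feats = [torch.randn(2, 196, 512) for _ in range(4)]
out = arm(feats)
print(out.shape)  # torch.Size([2, 196, 512])
```

The output keeps the token resolution of a single level, so it can replace the final CLIP feature map in a downstream segmentation head.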
We evaluate our method on standard semantic segmentation benchmarks:
| Dataset | Classes | Type | Download |
|---|---|---|---|
| PASCAL VOC 2012 | 21 (with background) | Indoor/Outdoor | Official |
| ADE20K | 150 | Scene Parsing | Official |
| COCO 2014 | 81 (with background) | Instance/Semantic | Official |
| PASCAL Context | 59/60/459 | Scene Understanding | Link |
| COCO-Stuff | 172 | Stuff Segmentation | GitHub |
| ADE20K-847 | 847 | Fine-grained | Official |
Organize datasets as follows:
```
data/
├── VOCdevkit/
│   └── VOC2012/
│       ├── JPEGImages/
│       ├── SegmentationClass/
│       └── ImageSets/
├── ADE20K/
│   ├── images/
│   └── annotations/
├── coco14/
│   ├── images/
│   └── annotations/
└── ...
```
Alternatively, update the dataset paths in `config.py` to match your own directory structure.
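A hypothetical shape for such a path configuration is shown below; the variable names `DATA_ROOT` and `DATASET_PATHS` are illustrative and may differ from the keys actually used in this repo's `config.py`.

```python
# Illustrative dataset-path config; key names are assumptions.
DATA_ROOT = "data"

DATASET_PATHS = {
    "voc2012": f"{DATA_ROOT}/VOCdevkit/VOC2012",
    "ade20k": f"{DATA_ROOT}/ADE20K",
    "coco14": f"{DATA_ROOT}/coco14",
}

print(DATASET_PATHS["voc2012"])  # data/VOCdevkit/VOC2012
```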
This project builds upon the following excellent open-source works:
- CLIP - Contrastive Language-Image Pre-training by OpenAI
- CLIPer - CLIP-based segmentation framework
- SCLIP - Semantic CLIP segmentation approach
- CAT-Seg - Cost aggregation-based open-vocabulary segmentation method
- MaskCLIP - CLIP-based mask prediction
We sincerely thank the authors for their contributions to the community.
The complete source code and pre-trained weights will be released upon official acceptance of the paper. Stay tuned for updates!