LearningSA

| Paper | Data|

LearningSA: Learning Semantic Alignment using Global Features and Multi-scale Confidence
Huaiyuan Xu, Jing Liao, Huaping Liu, and Yuxiang Sun*

Introduction

Semantic alignment aims to establish pixel correspondences between images based on semantic consistency. It can serve as a fundamental component for various downstream computer vision tasks, such as style transfer and exemplar-based colorization. Many existing methods use local features and their cosine similarities to infer semantic alignment. However, they struggle with significant intra-class variation of objects, for example in appearance and size. In other words, contents with the same semantics can look very different. To address this issue, we propose a novel deep neural network whose core lies in global feature enhancement and adaptive multi-scale inference. Specifically, two modules are proposed: an enhancement transformer that enhances semantic features with global awareness, and a probabilistic correlation module that adaptively fuses multi-scale information based on learned confidence scores. We use a unified network architecture to achieve two types of semantic alignment, namely cross-object semantic alignment and cross-domain semantic alignment. Experimental results demonstrate that our method achieves competitive performance on five standard cross-object semantic alignment benchmarks and outperforms the state of the art in cross-domain semantic alignment.
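For readers new to the task, the local-feature baseline mentioned above can be sketched in a few lines of NumPy. This illustrates cosine-similarity matching in general and is not the network proposed here; the function names `cosine_correlation` and `best_matches`, the feature shapes, and the random features are assumptions made for the example.

```python
import numpy as np

def cosine_correlation(feat_src, feat_tgt, eps=1e-8):
    """Cosine similarity between every pair of local features.

    feat_src: (C, Hs, Ws), feat_tgt: (C, Ht, Wt).
    Returns an array of shape (Hs*Ws, Ht*Wt).
    """
    c = feat_src.shape[0]
    src = feat_src.reshape(c, -1).T  # (Hs*Ws, C)
    tgt = feat_tgt.reshape(c, -1).T  # (Ht*Wt, C)
    # L2-normalise each feature vector so the dot product is a cosine.
    src = src / (np.linalg.norm(src, axis=1, keepdims=True) + eps)
    tgt = tgt / (np.linalg.norm(tgt, axis=1, keepdims=True) + eps)
    return src @ tgt.T

def best_matches(corr, tgt_width):
    """For each source location, the (row, col) of the most similar target location."""
    idx = corr.argmax(axis=1)
    return np.stack([idx // tgt_width, idx % tgt_width], axis=1)

# Toy data (hypothetical): the target features are the source features
# shifted one column to the right, so source (0, 0) should map to target (0, 1).
rng = np.random.default_rng(0)
feat_a = rng.standard_normal((16, 4, 5))
feat_b = np.roll(feat_a, shift=1, axis=2)
corr = cosine_correlation(feat_a, feat_b)
matches = best_matches(corr, tgt_width=5)
print(matches[0])
```

Because each location is matched on its own local feature, two regions that share semantics but differ in appearance or size can score poorly, which is the failure mode the introduction describes.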

Method

Method Pipeline:

Samples of the CroDom Dataset

Category: person

Get Started

Installation and Data Preparation

step 1. Prepare the same environment as the one provided in Docker (see the docker folder of this repo).

step 2. Clone the LearningSA repo:

git clone https://github.com/lab-sun/LearningSA.git
cd LearningSA

step 3. Download data and arrange the folder as:

LearningSA/
└── data
    ├── code 
    ├── crodom
    ├── PF_Pascal
    └── PF-dataset
    bbox_test_pairs_pf_pascal.csv
    bbox_test_pairs_pf.csv
    bbox_val_pairs_pf_pascal.csv
    train_pairs_pf_pascal.csv
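Before training, it may help to confirm that the folder matches the layout above. The following is a small sketch of such a check, not a script shipped with the repo; the helper name `missing_entries` is hypothetical, and placing the four CSV files directly under data/ is an assumption read off the tree.

```python
from pathlib import Path

# Assumed layout, read off the tree above. Whether the CSV files sit
# directly under data/ is an assumption.
EXPECTED_DIRS = ["code", "crodom", "PF_Pascal", "PF-dataset"]
EXPECTED_FILES = [
    "bbox_test_pairs_pf_pascal.csv",
    "bbox_test_pairs_pf.csv",
    "bbox_val_pairs_pf_pascal.csv",
    "train_pairs_pf_pascal.csv",
]

def missing_entries(repo_root):
    """Return the expected entries under <repo_root>/data that are absent."""
    data = Path(repo_root) / "data"
    missing = [d for d in EXPECTED_DIRS if not (data / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (data / f).is_file()]
    return missing

if __name__ == "__main__":
    absent = missing_entries(".")
    print("data layout looks complete" if not absent else f"missing: {absent}")
```

Run it from inside the LearningSA folder; an empty result means every listed folder and file was found.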

Train model

python train.py
# to train on a specific GPU, pass its id in place of xx
python train.py --gpu xx

Test model

# PF-PASCAL
python image.py
# PF-WILLOW
python image_willow.py

Visualize the predicted result

python image.py --vis

Checkpoints

Download checkpoints and save them to the folder weights:

Config	PCK(0.05)	PCK(0.10)	Model
best_checkpoint_pascal	81.4	93.4	gdrive
best_checkpoint_willow	55.6	80.4	gdrive
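The PCK columns report the percentage of correct keypoints: a transferred keypoint counts as correct when its distance to the ground truth is at most alpha times the longer side of a reference region. A minimal sketch of the metric follows, assuming (x, y) keypoints and a caller-supplied reference size; whether the evaluation scripts use the image size or the object bounding box as the reference is not stated here, and the function name `pck` and the toy keypoints are hypothetical.

```python
import numpy as np

def pck(pred_kps, gt_kps, ref_size, alpha):
    """Fraction of keypoints whose error is at most alpha * max(ref_size).

    pred_kps, gt_kps: (N, 2) arrays of (x, y) coordinates.
    ref_size: (height, width) of the reference region (image or bounding box).
    """
    pred_kps = np.asarray(pred_kps, dtype=float)
    gt_kps = np.asarray(gt_kps, dtype=float)
    dist = np.linalg.norm(pred_kps - gt_kps, axis=1)
    return float(np.mean(dist <= alpha * max(ref_size)))

# Toy keypoints (hypothetical): errors of 3, 6, 12 and about 1.41 pixels.
gt = np.array([[10, 10], [50, 40], [80, 90], [30, 70]])
pred = gt + np.array([[0, 3], [6, 0], [0, 12], [1, 1]])
print(pck(pred, gt, ref_size=(100, 100), alpha=0.05))  # 0.5  (threshold 5 px)
print(pck(pred, gt, ref_size=(100, 100), alpha=0.10))  # 0.75 (threshold 10 px)
```

The table lists these fractions as percentages, so 0.75 here would be written as 75.0.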

Applications

Style transfer, which transfers local styles from an exemplar image to regions of the input image with the same semantics:

Semantic alignment-based image morphing:
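Both applications consume the dense correspondences produced by semantic alignment. As a rough illustration only, and not the pipeline used to produce the results in this repo, the sketch below copies exemplar pixels to their matched positions and then blends the two images. The function names, the integer (row, col) correspondence format, and the toy arrays are assumptions; a full morphing method would also interpolate geometry, which this colour-only blend omits.

```python
import numpy as np

def warp_by_correspondence(exemplar, corr_yx):
    """Build an image whose pixel (y, x) is copied from exemplar[corr_yx[y, x]].

    exemplar: (H, W, 3) array.
    corr_yx: (H_out, W_out, 2) integer array holding, for every output pixel,
    the matched (row, col) position in the exemplar.
    """
    rows = np.clip(corr_yx[..., 0], 0, exemplar.shape[0] - 1)
    cols = np.clip(corr_yx[..., 1], 0, exemplar.shape[1] - 1)
    return exemplar[rows, cols]

def morph(image, warped_exemplar, t):
    """Linear cross-dissolve between an image and the aligned exemplar, t in [0, 1]."""
    return (1.0 - t) * image + t * warped_exemplar

# Toy example (hypothetical): the correspondence mirrors the exemplar horizontally.
exemplar = np.arange(12, dtype=float).reshape(2, 2, 3)
ys, xs = np.mgrid[0:2, 0:2]
corr_yx = np.stack([ys, 1 - xs], axis=-1)
warped = warp_by_correspondence(exemplar, corr_yx)
halfway = morph(exemplar, warped, t=0.5)
```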

Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entries.

@ARTICLE{xu2024learning,
  author={Huaiyuan Xu and Jing Liao and Huaping Liu and Yuxiang Sun},
  journal={IEEE Transactions on Circuits and Systems for Video Technology}, 
  title={Learning Semantic Alignment Using Global Features and Multi-Scale Confidence}, 
  year={2024},
  volume={34},
  number={2},
  pages={897-910},
  doi={10.1109/TCSVT.2023.3288370}}
