This project explores deep learning architectures for semantic segmentation, including Fully Convolutional Networks (FCNs), U-Net, and SegFormer, and covers the implementation, training, and evaluation of these models on semantic segmentation tasks.
- Fully Convolutional Networks (FCNs): Replace fully connected layers with convolutional layers to output spatial maps of class predictions.
- U-Net: Uses an encoder-decoder structure with skip connections that retain spatial information for improved segmentation.
- SegFormer: A hierarchical transformer-based model that processes image patches and uses a lightweight Multi-Layer Perceptron (MLP) decoder for upscaling and classification.
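The encoder-decoder-with-skip-connections idea behind U-Net can be sketched in a few lines of PyTorch. This `MiniUNet` is illustrative only (one downsampling stage, made-up channel counts), not the project's actual model:

```python
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    """Toy U-Net-style model: one encoder stage, one decoder stage, and a
    skip connection carrying full-resolution features to the decoder."""
    def __init__(self, in_ch=3, num_classes=3):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)                        # halve H and W
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)  # restore resolution
        # Decoder sees upsampled features concatenated with the skip features.
        self.dec = nn.Conv2d(32, num_classes, 3, padding=1)

    def forward(self, x):
        skip = self.enc(x)                 # full-resolution features
        x = self.mid(self.down(skip))
        x = self.up(x)
        x = torch.cat([x, skip], dim=1)    # skip connection
        return self.dec(x)                 # per-pixel class logits

logits = MiniUNet()(torch.randn(1, 3, 64, 64))
print(logits.shape)  # torch.Size([1, 3, 64, 64])
```

Note that the output keeps the input's spatial size, so each pixel gets a vector of class logits.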
- Downsampling: Reduces spatial dimensions for computational efficiency using pooling and strided convolutions.
- Upsampling: Restores feature maps to original resolution using interpolation or transposed convolutions.
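The four operations above can be compared directly in PyTorch (channel counts and sizes here are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 8, 32, 32)

# Downsampling: pooling vs. strided convolution (both halve H and W).
pooled = F.max_pool2d(x, kernel_size=2)                           # (1, 8, 16, 16)
strided = nn.Conv2d(8, 8, kernel_size=3, stride=2, padding=1)(x)  # (1, 8, 16, 16)

# Upsampling: interpolation (fixed rule) vs. transposed convolution (learned).
interp = F.interpolate(pooled, scale_factor=2, mode="bilinear",
                       align_corners=False)                        # (1, 8, 32, 32)
learned = nn.ConvTranspose2d(8, 8, kernel_size=2, stride=2)(pooled)  # (1, 8, 32, 32)
```

Pooling and interpolation have no parameters; strided and transposed convolutions learn their filters during training.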
- Pre-training: Uses a general dataset to learn initial features.
- Fine-tuning: Adapts features for specific datasets, improving performance with limited data.
Trained on the OxfordIIITPet dataset:
- Pre-trained Encoder vs. Scratch Training:
  - The pre-trained model showed faster convergence and a higher mIoU.
  - Validation plots confirmed the advantage of pre-training.
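The dataset is available as `torchvision.datasets.OxfordIIITPet` with `target_types="segmentation"`. Its trimap masks label pixels 1 (pet), 2 (background), and 3 (border), so one common preprocessing step (sketched here; the helper name is illustrative) is shifting them to 0-based class indices for the loss:

```python
import torch

def trimap_to_target(mask: torch.Tensor) -> torch.Tensor:
    """Shift OxfordIIITPet trimap labels {1, 2, 3} to class indices {0, 1, 2}
    so the mask can feed nn.CrossEntropyLoss directly."""
    return (mask - 1).long()

fake_trimap = torch.tensor([[1, 2], [3, 2]])
print(trimap_to_target(fake_trimap))  # tensor([[0, 1], [2, 1]])
```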
Fine-tuned using the OxfordIIITPet dataset:
- Methods:
  - Freezing Encoder: encoder weights are frozen during training, so only the decoder is updated.
  - No Freezing: encoder weights are updated along with the decoder.
- Observations:
  - The "No Freezing" method showed better convergence.
  - Validation loss and mIoU plots indicated underfitting in the early stages.
- Downsampling and Upsampling: Essential for balancing computational load and prediction accuracy.
- Pre-training and Fine-tuning: Effective for tasks with limited data, though performance varies based on dataset similarity.
- Challenges: Fine-tuning can lead to underfitting when initial and target tasks differ significantly.







