This project investigates how different residual block architectures impact the learning behavior and performance of Residual Networks (ResNet). Specifically, it implements a "Mini-ResNet" and evaluates four distinct modular variants of residual blocks on a 25-class subset of the Tiny-ImageNet dataset.
The objective is to perform an ablation study on residual block designs, comparing "Post-Activation" vs. "Pre-Activation" strategies and "Basic" vs. "Bottleneck" configurations.
- Basic Post-Activation: Standard ResNet block with ReLU after the addition.
- Bottleneck Post-Activation: Uses 1x1 convolutions to reduce and then restore the channel dimension (bottleneck), with ReLU after the addition.
- Basic Pre-Activation: Basic block with Batch Normalization and ReLU applied before each convolution.
- Bottleneck Pre-Activation: Bottleneck configuration with pre-activation.
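To make the two design axes concrete, here is a minimal PyTorch sketch of two of the four variants: a basic post-activation block and a bottleneck pre-activation block. It is not the code from models.py. The class names, the 1x1 projection shortcut, and the `reduction=4` bottleneck ratio are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_shortcut(in_ch, out_ch, stride):
    """Identity when shapes match, otherwise a 1x1 projection (an assumption)."""
    if stride == 1 and in_ch == out_ch:
        return nn.Identity()
    return nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False)


class BasicPostAct(nn.Module):
    """Basic block, post-activation: conv-BN-ReLU-conv-BN, add, then ReLU."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.shortcut = make_shortcut(in_ch, out_ch, stride)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + self.shortcut(x))  # ReLU after the addition


class BottleneckPreAct(nn.Module):
    """Bottleneck block, pre-activation: BN-ReLU before each of the 1x1, 3x3, 1x1 convs."""

    def __init__(self, in_ch, out_ch, stride=1, reduction=4):
        super().__init__()
        mid = out_ch // reduction  # hypothetical bottleneck width
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.conv1 = nn.Conv2d(in_ch, mid, kernel_size=1, bias=False)  # reduce channels
        self.bn2 = nn.BatchNorm2d(mid)
        self.conv2 = nn.Conv2d(mid, mid, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn3 = nn.BatchNorm2d(mid)
        self.conv3 = nn.Conv2d(mid, out_ch, kernel_size=1, bias=False)  # restore channels
        self.shortcut = make_shortcut(in_ch, out_ch, stride)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        out = self.conv2(F.relu(self.bn2(out)))
        out = self.conv3(F.relu(self.bn3(out)))
        return out + self.shortcut(x)  # no activation after the addition
```

The remaining two variants follow by swapping the ordering: a basic block with BN-ReLU ahead of each 3x3 convolution, and a bottleneck block with the ReLU moved after the addition.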
The custom Mini-ResNet is designed for 64x64 input images and consists of:
- Stem: 3x3 Conv (32 filters, stride 1) → BatchNorm → ReLU.
- Stage 1: 2 Residual Blocks (32 channels).
- Stage 2: 2 Residual Blocks (64 channels, stride 2 downsampling in the first block).
- Stage 3: 2 Residual Blocks (128 channels, stride 2 downsampling in the first block).
- Head: Global Average Pooling (GAP) → Fully Connected Layer (25 classes).
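The layout above can be sketched in PyTorch as follows. This is an illustrative reconstruction, not the contents of models.py: the `Block` class is a stand-in basic post-activation block, the stem padding of 1 is an assumption (the description only gives kernel size and stride), and the `block` constructor argument is a hypothetical way to swap in the other variants.

```python
import torch
import torch.nn as nn


class Block(nn.Module):
    """Stand-in basic post-activation block; any variant with the same signature fits."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        if stride != 1 or in_ch != out_ch:
            # 1x1 projection so the shortcut matches the new shape (an assumption)
            self.shortcut = nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False)
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        return torch.relu(self.body(x) + self.shortcut(x))


class MiniResNet(nn.Module):
    def __init__(self, block=Block, num_classes=25):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1, bias=False),  # padding assumed
            nn.BatchNorm2d(32),
            nn.ReLU(),
        )
        self.stage1 = nn.Sequential(block(32, 32, 1), block(32, 32, 1))
        self.stage2 = nn.Sequential(block(32, 64, 2), block(64, 64, 1))
        self.stage3 = nn.Sequential(block(64, 128, 2), block(128, 128, 1))
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling
        self.fc = nn.Linear(128, num_classes)

    def forward(self, x):
        x = self.stem(x)    # (N, 32, 64, 64)
        x = self.stage1(x)  # (N, 32, 64, 64)
        x = self.stage2(x)  # (N, 64, 32, 32)
        x = self.stage3(x)  # (N, 128, 16, 16)
        x = torch.flatten(self.pool(x), 1)  # (N, 128)
        return self.fc(x)   # (N, num_classes)
```

With a stride-1 stem, the spatial resolution stays at 64x64 through Stage 1 and is halved only in Stages 2 and 3, so the final feature map is 16x16 before pooling.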
- Source: A subset of the Tiny-ImageNet dataset.
- Classes: 25 randomly selected classes (seeded using the last 3 digits of a matriculation number).
- Preprocessing:
  - Images normalized using ImageNet statistics: mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225].
  - Data loaders handle class isolation and label remapping.
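The dataset steps above (seeded class selection, label remapping, and per-channel normalization) can be sketched with the standard library alone. This is not the code from data.py: the function names are hypothetical, the seed `123` is a placeholder for the matriculation-number digits, and the class IDs are synthetic stand-ins for the real WordNet IDs.

```python
import random

IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def select_classes(all_class_ids, num_classes=25, seed=123):
    """Pick a reproducible random subset of class IDs (seed value is a placeholder)."""
    rng = random.Random(seed)
    # Sort first so the result does not depend on the input ordering.
    return sorted(rng.sample(sorted(all_class_ids), num_classes))


def build_label_map(selected_ids):
    """Remap the selected class IDs to contiguous labels 0..len-1."""
    return {class_id: index for index, class_id in enumerate(selected_ids)}


def normalize_pixel(rgb):
    """Normalize one RGB pixel with values in [0, 1]: (value - mean) / std per channel."""
    return tuple((v - m) / s for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


# Tiny-ImageNet has 200 classes; these IDs are synthetic placeholders.
all_ids = [f"class_{i:03d}" for i in range(200)]
selected = select_classes(all_ids)
label_map = build_label_map(selected)
```

Samples whose class ID is not in `label_map` would be dropped (class isolation), and the remaining targets replaced by `label_map[class_id]` so that they fall in the range 0 to 24 expected by the 25-way classifier head.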
- Framework: PyTorch
- Optimization: SGD with momentum 0.9 and weight decay 5e-4, trained with cross-entropy loss.
- Metrics Tracked:
  - Training/Validation Loss and Accuracy.
  - Model Size (Parameters).
  - Computational Complexity (MACs).
  - Inference Latency (averaged over 100 iterations with 10 warm-up runs).
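A rough sketch of the training setup and two of the tracked metrics is shown below. It is not taken from train_all.py or utils.py: the helper names are hypothetical, the learning rate of 0.1 is an assumed value (the configuration above does not state one), and a tiny stand-in model is used so the snippet runs on its own. MAC counting is omitted because it depends on which profiling tool is used.

```python
import time

import torch
import torch.nn as nn


def count_parameters(model):
    """Number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


@torch.no_grad()
def measure_latency(model, example, warmup=10, iters=100):
    """Mean forward-pass time in seconds on CPU, after warm-up runs."""
    model.eval()
    for _ in range(warmup):
        model(example)
    start = time.perf_counter()
    for _ in range(iters):
        model(example)
    return (time.perf_counter() - start) / iters


# Tiny stand-in model so the example is self-contained.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 25),
)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,  # assumed; not specified in the project description
    momentum=0.9,
    weight_decay=5e-4,
)

# One training step on random data.
images = torch.randn(4, 3, 64, 64)
targets = torch.randint(0, 25, (4,))
loss = criterion(model(images), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

num_params = count_parameters(model)
latency_s = measure_latency(model, torch.randn(1, 3, 64, 64))
```

When timing on a GPU, `torch.cuda.synchronize()` should be called before reading the timer, since CUDA kernels run asynchronously and the wall-clock measurement would otherwise stop early.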
- Project3_Results.ipynb: The primary notebook containing the ablation study results, comparative analysis, and visualizations.
- models.py: Implementation of the MiniResNet and the four modular block types.
- data.py: Data loading and preprocessing scripts for Tiny-ImageNet-25.
- train_all.py: Script to train all four variants sequentially and save results.
- utils.py: Utility functions for metrics calculation and visualization.
Detailed empirical results, including loss/accuracy curves and performance tables, are available in the Project3_Results.ipynb notebook.
This project was completed as part of the CG3201 Coursework.