MikhailArtemyev/AI-ImageClassificationProject

ResNet50 Image Classification Project

A deep learning implementation of the ResNet50 architecture for 6-class image classification using Keras/TensorFlow. The provided model is trained to recognize hand gestures.

Overview

This project implements a custom ResNet50 (Residual Network) architecture for classifying images into 6 different classes. ResNet50 is a deep convolutional neural network that uses residual connections to enable training of very deep networks while avoiding the vanishing gradient problem.

Architecture

ResNet50 Structure

The ResNet50 model consists of:

  • Input Layer: Accepts 64×64×3 RGB images
  • Initial Convolution: 7×7 conv layer with 64 filters, stride 2
  • 5 Main Stages: Each containing residual blocks with skip connections
  • Global Average Pooling: Reduces spatial dimensions
  • Dense Output Layer: 6-class softmax classifier

Key Components

1. Identity Block

Input → Conv1×1 → BN → ReLU → Conv3×3 → BN → ReLU → Conv1×1 → BN → Add(shortcut) → ReLU → Output
  • Used when input and output dimensions match
  • Implements the core residual learning: H(x) = F(x) + x
  • Contains 3 convolutional layers with batch normalization
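The identity block above can be sketched in Keras as follows. This is a minimal illustration of the diagrammed path, not the repo's exact code; the name identity_block and its signature are assumptions.

```python
from tensorflow.keras import layers

def identity_block(x, filters):
    """Bottleneck identity block: Conv1x1 -> Conv3x3 -> Conv1x1, each with BN,
    then Add(shortcut). `filters` is (F1, F2, F3); the input's channel count
    must already equal F3 so the shapes match at the Add."""
    f1, f2, f3 = filters
    shortcut = x                                  # unmodified skip path
    x = layers.Conv2D(f1, 1)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(f2, 3, padding="same")(x)   # "same" keeps spatial size
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(f3, 1)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([x, shortcut])               # H(x) = F(x) + x
    return layers.ReLU()(x)
```

Because the shortcut is added unchanged, the output shape equals the input shape.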

2. Convolutional Block

Input → Conv1×1 → BN → ReLU → Conv3×3 → BN → ReLU → Conv1×1 → BN
   ↓                                                              ↓
   → Conv1×1 → BN ────────────────────────────────────→ Add → ReLU → Output
  • Used when input and output dimensions differ
  • Includes a shortcut path with 1×1 convolution to match dimensions
  • Enables downsampling between stages
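A matching sketch of the convolutional block, again illustrative rather than the repo's exact code (the name conv_block and the stride default are assumptions). The only difference from the identity block is the projection shortcut, which lets both the channel count and the spatial size change.

```python
from tensorflow.keras import layers

def conv_block(x, filters, stride=2):
    """Bottleneck block with a projected shortcut. The 1x1 shortcut conv
    matches both channels (F3) and spatial size (via `stride`), enabling
    downsampling between stages."""
    f1, f2, f3 = filters
    shortcut = layers.Conv2D(f3, 1, strides=stride)(x)  # dimension-matching path
    shortcut = layers.BatchNormalization()(shortcut)
    x = layers.Conv2D(f1, 1, strides=stride)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(f2, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(f3, 1)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([x, shortcut])
    return layers.ReLU()(x)
```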

3. Network Stages

Stage | Output Size | Blocks              | Filters         | Operations
------|-------------|---------------------|-----------------|------------------------------------
1     | 32×32       | 1 conv block        | [64,64,256]     | Initial feature extraction
2     | 32×32       | 1 conv + 2 identity | [64,64,256]     | Low-level features
3     | 16×16       | 1 conv + 3 identity | [128,128,512]   | Mid-level features, 2× downsample
4     | 8×8         | 1 conv + 5 identity | [256,256,1024]  | High-level features, 2× downsample
5     | 4×4         | 1 conv + 2 identity | [512,512,2048]  | Abstract features, 2× downsample
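The whole network can be assembled from the stage table by looping over the per-stage configuration. The sketch below is self-contained and folds the two block types into one helper (project=True gives the conv block, project=False the identity block); build_resnet50 and residual_block are illustrative names, not the repo's actual functions.

```python
from tensorflow.keras import layers, Model

def residual_block(x, filters, stride=1, project=False):
    """One bottleneck block. `project=True` adds the 1x1 shortcut conv
    (the "convolutional block"); otherwise the input itself is the shortcut
    (the "identity block")."""
    f1, f2, f3 = filters
    shortcut = x
    if project:
        shortcut = layers.Conv2D(f3, 1, strides=stride)(x)
        shortcut = layers.BatchNormalization()(shortcut)
    x = layers.Conv2D(f1, 1, strides=stride)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(f2, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(f3, 1)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(layers.Add()([x, shortcut]))

def build_resnet50(input_shape=(64, 64, 3), num_classes=6):
    inputs = layers.Input(input_shape)
    x = layers.Conv2D(64, 7, strides=2, padding="same")(inputs)  # 64x64 -> 32x32
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    # (filters, identity blocks, stride) per residual stage, as in the table
    stage_cfg = [([64, 64, 256], 2, 1),      # stays 32x32
                 ([128, 128, 512], 3, 2),    # -> 16x16
                 ([256, 256, 1024], 5, 2),   # -> 8x8
                 ([512, 512, 2048], 2, 2)]   # -> 4x4
    for filters, n_identity, stride in stage_cfg:
        x = residual_block(x, filters, stride, project=True)
        for _ in range(n_identity):
            x = residual_block(x, filters)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inputs, outputs)
```

The resulting model has roughly 23M trainable parameters, in line with the Technical Details below.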

How It Works

1. Residual Learning

ResNet addresses the vanishing gradient problem in deep networks through skip connections:

  • Instead of learning H(x) directly, the network learns the residual F(x) = H(x) - x
  • The final output becomes H(x) = F(x) + x
  • This allows gradients to flow directly through skip connections during backpropagation

2. Feature Extraction Pipeline

  1. Early Layers: Extract low-level features (edges, textures)
  2. Middle Layers: Combine features into patterns and shapes
  3. Deep Layers: Learn high-level semantic representations
  4. Global Pooling: Aggregates spatial information
  5. Classifier: Maps features to class probabilities

3. Training Process

  • Data Preprocessing: Images normalized to [0,1] range
  • Data Splitting: 70% training, 30% testing
  • Loss Function: Categorical crossentropy for multi-class classification
  • Optimizer: Adam with learning rate 0.00015
  • Target: Automatically trains until >85% accuracy
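The preprocessing, split, and compile settings above can be sketched as one helper. This is a minimal illustration under the stated settings (Adam at 0.00015, categorical crossentropy, 70/30 split); the function name and argument names are assumptions, not the repo's API.

```python
import numpy as np
import tensorflow as tf

def prepare_and_compile(model, images, labels, num_classes=6, test_frac=0.30):
    """Normalize pixel values to [0, 1], one-hot encode the labels,
    split 70/30, and compile with the training settings described above."""
    x = np.asarray(images).astype("float32") / 255.0        # [0, 255] -> [0, 1]
    y = tf.keras.utils.to_categorical(labels, num_classes)  # one-hot for crossentropy
    n_train = int(len(x) * (1.0 - test_frac))               # 70% train / 30% test
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.00015),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return (x[:n_train], y[:n_train]), (x[n_train:], y[n_train:])
```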

Project Structure

project/
├── resnet_model.py           # Main implementation
├── load_images.py           # Image loading utility
├── images_with_labels/      # Training dataset
├── images/                  # Test images
└── resnet50_trained.keras   # Saved model

Key Functions

Model Creation

new_model(path_to_model, print_summary=False)
  • Creates fresh ResNet50 model
  • Compiles with Adam optimizer
  • Saves to specified path

Training

train_model(path_to_model, path_to_images)
train_()  # auto-trains until >85% accuracy on the test set
  • Loads dataset with 70/30 split
  • Trains for 10 epochs per iteration
  • Evaluates on test set
  • Saves improved model
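The train_() loop described above can be sketched like this: repeat 10-epoch rounds, evaluate on the held-out set, save, and stop once the target is passed. The function name train_until_target, the max_rounds safety cap, and the argument layout are assumptions for illustration.

```python
def train_until_target(model, train_data, test_data, target_acc=0.85,
                       epochs_per_round=10, max_rounds=20,
                       save_path="resnet50_trained.keras"):
    """Run repeated 10-epoch training rounds until test accuracy
    exceeds `target_acc`, saving the model after each round."""
    x_train, y_train = train_data
    x_test, y_test = test_data
    acc = 0.0
    for _ in range(max_rounds):                      # cap to avoid looping forever
        model.fit(x_train, y_train, epochs=epochs_per_round, verbose=0)
        _, acc = model.evaluate(x_test, y_test, verbose=0)
        model.save(save_path)                        # keep the latest weights
        if acc > target_acc:
            break
    return acc
```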

Prediction

predict(path_to_model, path_to_image)
  • Loads trained model
  • Preprocesses input image
  • Returns class probabilities and prediction
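The prediction path (load model, preprocess, predict) can be sketched as below. The name predict_image is hypothetical; the resize-to-64×64 and divide-by-255 steps mirror the preprocessing described above.

```python
import numpy as np
from tensorflow import keras

def predict_image(model_path, image_path):
    """Load a saved model, preprocess one image, and return the predicted
    class index along with the full softmax probability vector."""
    model = keras.models.load_model(model_path)
    img = keras.utils.load_img(image_path, target_size=(64, 64))  # resize to input size
    arr = keras.utils.img_to_array(img) / 255.0                   # [0, 255] -> [0, 1]
    probs = model.predict(arr[np.newaxis, ...], verbose=0)[0]     # batch of one
    return int(np.argmax(probs)), probs
```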

Usage

Creating a New Model

new_model('my_model.keras', print_summary=True)

Training

# Single training session
train_model('my_model.keras', 'path/to/images')

# Auto-train until target accuracy
train_()

Making Predictions

predict('resnet50_trained.keras', 'path/to/image.jpg')

Technical Details

  • Input Size: 64×64×3 RGB images
  • Classes: 6 (labeled 0-5)
  • Parameters: ~23M trainable parameters
  • Memory: Moderate GPU memory requirements
  • Training Time: Varies with dataset size and hardware

Dependencies

keras
tensorflow
numpy
matplotlib

Model Performance

The model automatically trains until achieving >85% test accuracy. Performance tracking includes:

  • Training/validation loss curves
  • Accuracy metrics per epoch
  • Stage-wise feature map dimensions
  • Final classification results

Acknowledgments

This work is heavily inspired by the DeepLearning.AI Convolutional Neural Networks course: https://www.coursera.org/learn/convolutional-neural-networks

About

Image recognition using Keras (September 2025).
