This repository demonstrates the training of a GoogLeNet model on the MNIST dataset using PyTorch. The dataset is resized and normalized to suit the GoogLeNet architecture, and the implementation achieves high accuracy.
GoogLeNet, also known as Inception v1, is a convolutional neural network architecture that uses inception modules to capture multi-scale features efficiently. This implementation uses the standard GoogLeNet variant.

Set up the environment using the following commands:
conda create --name pytorch_env python=3.11.9 --file requirements.txt
conda activate pytorch_env

- The dataset is loaded using torchvision.datasets.MNIST.
- Images are resized to 32x32 to fit GoogLeNet's input requirements.
- Images are normalized and converted to tensors.
The dataset is split as follows:
- Training Set: 42,000 samples
- Validation Set: 14,000 samples
- Test Set: 14,000 samples
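The 42,000/14,000/14,000 counts correspond to a 60/20/20 split of the 70,000 MNIST images. One way to produce such a split is `torch.utils.data.random_split`; the exact mechanics (and the seed) are assumptions here, inferred only from the counts above. A dummy dataset stands in for MNIST so the sketch is self-contained.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in for the full 70,000-image MNIST dataset
full = TensorDataset(torch.zeros(70000, 1))

# Fixed seed for a reproducible split (seed value is an assumption)
generator = torch.Generator().manual_seed(42)
train_set, val_set, test_set = random_split(
    full, [42000, 14000, 14000], generator=generator
)
```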
- GoogLeNet is used with the following modifications:
  - Adjusted for single-channel (grayscale) input.
  - Output layer configured for 10 classes.
- The model is moved to the GPU if available.
- Loss Function: CrossEntropyLoss
- Optimizer: Adam with a learning rate of 0.001
The model is trained for 10 epochs using the following key steps:
- Each batch of the training set is passed through the model to compute predictions.
- The loss is calculated using CrossEntropyLoss.
- Gradients are computed and weights are updated using the Adam optimizer.
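The training steps above can be sketched as a single optimization step. A tiny linear model stands in for GoogLeNet so the snippet is self-contained; the loss function and Adam settings match those listed above.

```python
import torch
import torch.nn as nn

# Stand-in model so the sketch runs on its own; the repo uses GoogLeNet here
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# One dummy batch of 32x32 grayscale images and labels
images = torch.randn(8, 1, 32, 32)
labels = torch.randint(0, 10, (8,))

optimizer.zero_grad()              # clear gradients from the previous step
outputs = model(images)            # forward pass: compute predictions
loss = criterion(outputs, labels)  # CrossEntropyLoss on raw logits
loss.backward()                    # backpropagate to compute gradients
optimizer.step()                   # Adam weight update
```

In the full loop these steps repeat over all batches for each of the 10 epochs.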
- The model achieves a test accuracy of 99.04%.
Plots showing the training and validation loss over epochs:
A heatmap of the confusion matrix shows the model's performance across all classes:
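A confusion matrix heatmap like the one above can be built by counting (true, predicted) label pairs and rendering the grid with matplotlib. This is a minimal sketch with toy labels; the repository computes the matrix from predictions on the MNIST test set, and its exact plotting code is not shown here.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

def confusion_matrix(y_true, y_pred, num_classes):
    """Count (true, predicted) label pairs into a num_classes x num_classes grid."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy labels for illustration (the repo uses the 10 MNIST classes)
y_true = [0, 1, 2, 2, 1]
y_pred = [0, 1, 2, 1, 1]
cm = confusion_matrix(y_true, y_pred, num_classes=3)

# Render the matrix as a heatmap
fig, ax = plt.subplots()
ax.imshow(cm, cmap="Blues")
ax.set_xlabel("Predicted label")
ax.set_ylabel("True label")
fig.savefig("confusion_matrix.png")
```

Off-diagonal cells show which digit pairs the model confuses; a near-diagonal heatmap reflects the high test accuracy reported above.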
- Python 3.11.9
- PyTorch
- torchvision
- numpy
- matplotlib
- PIL (Pillow)
For a detailed list of dependencies, refer to requirements.txt.

