This project implements a fully connected neural network from scratch using NumPy to classify handwritten digits from the MNIST dataset.
The goal of the project was to understand the internal mechanics of neural networks without relying on deep learning frameworks.
The network has three layers: an input layer of 784 units (a flattened 28x28 image), a hidden layer of 256 neurons with ReLU activation, and an output layer of 10 neurons with softmax activation.
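The layer sizes fix the shape of every parameter array. The following is a minimal sketch of that, assuming He initialisation and inputs stored one example per row. The names `init_params`, `W1`, `b1`, `W2` and `b2` are hypothetical and are not taken from the project code.

```python
import numpy as np

LAYER_SIZES = (784, 256, 10)  # input, hidden, output


def init_params(sizes=LAYER_SIZES, seed=0):
    """Return weights and biases for a 784-256-10 network.

    Assumption: He initialisation (std = sqrt(2 / fan_in)); the project
    may use a different scheme.
    """
    rng = np.random.default_rng(seed)
    n_in, n_hidden, n_out = sizes
    return {
        "W1": rng.standard_normal((n_in, n_hidden)) * np.sqrt(2.0 / n_in),
        "b1": np.zeros(n_hidden),
        "W2": rng.standard_normal((n_hidden, n_out)) * np.sqrt(2.0 / n_hidden),
        "b2": np.zeros(n_out),
    }


params = init_params()
n_params = sum(p.size for p in params.values())
print(n_params)  # 784*256 + 256 + 256*10 + 10 = 203530
```

With these sizes the model has 203,530 trainable parameters, almost all of them in the first weight matrix.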
- Forward propagation
- Backpropagation
- Cross-entropy loss
- Mini-batch stochastic gradient descent
- Learning rate scheduling
- Validation set monitoring
- L2 regularisation
- Early checkpoint selection
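Several of the items above are small, independent pieces of the training loop. The sketch below shows one plausible form of each. It assumes a step-decay learning rate schedule and reads "checkpoint selection" as keeping the parameters with the best validation accuracy; both are assumptions, and every name here (`minibatches`, `step_decay`, `l2_penalty`, `BestCheckpoint`) is hypothetical.

```python
import numpy as np


def minibatches(X, y, batch_size, rng):
    """Yield shuffled (X_batch, y_batch) pairs that cover the data once."""
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]


def step_decay(base_lr, epoch, drop=0.5, every=5):
    """Assumed schedule: multiply the rate by `drop` every `every` epochs."""
    return base_lr * drop ** (epoch // every)


def l2_penalty(weights, lam):
    """L2 term added to the loss: (lam / 2) * sum of squared weights."""
    return 0.5 * lam * sum(np.sum(W * W) for W in weights)


class BestCheckpoint:
    """Keep a copy of the parameters with the highest validation accuracy."""

    def __init__(self):
        self.best_acc = -np.inf
        self.best_params = None

    def update(self, val_acc, params):
        """Store `params` if `val_acc` beats the best seen so far."""
        if val_acc > self.best_acc:
            self.best_acc = val_acc
            self.best_params = {k: v.copy() for k, v in params.items()}
            return True
        return False


rng = np.random.default_rng(0)
X, y = rng.random((10, 4)), rng.integers(0, 3, size=10)
batch_sizes = [len(xb) for xb, _ in minibatches(X, y, batch_size=4, rng=rng)]
print(batch_sizes)  # [4, 4, 2]
```

The gradient of the penalty with respect to a weight matrix `W` is `lam * W`, which is the term added to each weight gradient during the update. Biases are commonly left out of the penalty.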
Test Accuracy: ~98%
Training time: ~1 minute on CPU
- Loads the MNIST dataset using tensorflow.keras.datasets
- Normalises image values
- Flattens images into 784-dimensional vectors
- Splits training into training and validation sets
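The preprocessing steps above can be sketched as follows. This is an illustration under stated assumptions: normalisation is taken to mean dividing by 255, the validation size is arbitrary, and random arrays stand in for MNIST so the example runs without TensorFlow. The function names are hypothetical.

```python
import numpy as np


def preprocess(images):
    """Scale uint8 pixels to [0, 1] and flatten 28x28 images to 784 values."""
    return images.reshape(len(images), -1).astype(np.float32) / 255.0


def train_val_split(X, y, val_size, seed=42):
    """Shuffle once, then hold out `val_size` examples for validation."""
    order = np.random.default_rng(seed).permutation(len(X))
    train_idx, val_idx = order[:-val_size], order[-val_size:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]


# Stand-in data with the same shape and dtype as MNIST images. In the
# project the arrays would instead come from:
#   (x_train, y_train), (x_test, y_test) = \
#       tensorflow.keras.datasets.mnist.load_data()
rng = np.random.default_rng(0)
x_train = rng.integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)
y_train = rng.integers(0, 10, size=1000)

X = preprocess(x_train)
X_tr, y_tr, X_val, y_val = train_val_split(X, y_train, val_size=100)
print(X_tr.shape, X_val.shape)  # (900, 784) (100, 784)
```

Shuffling before the split keeps the validation set representative even if the source arrays are ordered.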
- Weight initialisation
- Forward propagation
- Backpropagation
- Gradient descent update
- Training step logic
- Accuracy and loss computation
- Full training of the model
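To show how the pieces in this list fit together, here is a sketch of a single training step for a one-hidden-layer ReLU/softmax network. It is not the project's code: the function names, the parameter dictionary layout and the default hyperparameters are assumptions, and the demo uses a tiny network on random data so that it runs in well under a second.

```python
import numpy as np


def forward(params, X):
    """Return class probabilities and the cached hidden-layer values."""
    z1 = X @ params["W1"] + params["b1"]
    a1 = np.maximum(z1, 0.0)                     # ReLU
    z2 = a1 @ params["W2"] + params["b2"]
    z2 = z2 - z2.max(axis=1, keepdims=True)      # stabilise the softmax
    exp = np.exp(z2)
    return exp / exp.sum(axis=1, keepdims=True), (z1, a1)


def train_step(params, X, y, lr=0.1, lam=0.0):
    """One forward pass, one backward pass, one gradient descent update.

    Returns the mean cross-entropy before the update (L2 term excluded).
    """
    n = len(X)
    probs, (z1, a1) = forward(params, X)
    loss = -np.mean(np.log(probs[np.arange(n), y] + 1e-12))

    # Softmax + cross-entropy: gradient w.r.t. logits is (probs - onehot) / n.
    dz2 = probs.copy()
    dz2[np.arange(n), y] -= 1.0
    dz2 /= n
    dz1 = (dz2 @ params["W2"].T) * (z1 > 0)      # back through ReLU

    grads = {
        "W1": X.T @ dz1 + lam * params["W1"],    # lam * W is the L2 gradient
        "b1": dz1.sum(axis=0),
        "W2": a1.T @ dz2 + lam * params["W2"],
        "b2": dz2.sum(axis=0),
    }
    for name in params:
        params[name] -= lr * grads[name]
    return loss


# Toy demo: 20 inputs, 16 hidden units, 3 classes (not the MNIST sizes).
rng = np.random.default_rng(0)
params = {
    "W1": rng.standard_normal((20, 16)) * np.sqrt(2.0 / 20),
    "b1": np.zeros(16),
    "W2": rng.standard_normal((16, 3)) * np.sqrt(2.0 / 16),
    "b2": np.zeros(3),
}
X = rng.standard_normal((64, 20))
y = rng.integers(0, 3, size=64)
losses = [train_step(params, X, y) for _ in range(100)]
print(losses[0] > losses[-1])
```

All gradients are computed before any parameter changes, so the backward pass through `W2` uses the same weights as the forward pass.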
Includes functions used by the network:
- ReLU activation + gradient
- Softmax
- Cross-entropy
- Linear-layer computation
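The helpers listed above are short enough to sketch in full. The versions below are one common way to write them in NumPy, assuming integer class labels and one example per row; the signatures are hypothetical and may differ from the project's.

```python
import numpy as np


def linear(X, W, b):
    """Affine layer: X has one row per example."""
    return X @ W + b


def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)


def relu_grad(z):
    """Derivative of ReLU, taken to be 0 at z == 0 by convention."""
    return (z > 0).astype(z.dtype)


def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(probs, y, eps=1e-12):
    """Mean negative log-likelihood of the integer labels y."""
    return -np.mean(np.log(probs[np.arange(len(y)), y] + eps))


# Large logits would overflow a naive exp(); the shifted version does not.
logits = np.array([[1000.0, 1000.0], [0.0, np.log(3.0)]])
probs = softmax(logits)
print(probs)  # approximately [[0.5, 0.5], [0.25, 0.75]]
print(cross_entropy(probs, np.array([0, 1])))
```

The small `eps` inside the logarithm guards against `log(0)` when a predicted probability underflows to zero.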
Contains a visualisation of the trained model's predictions (random state = 42 for reproducibility)
Lists the required Python dependencies
Install dependencies: pip install -r requirements.txt
Run training with: python MNIST.py
The MNIST dataset is loaded via tensorflow.keras.datasets
Images are normalised and flattened into 784-dimensional vectors before training
This project focuses on building neural network training logic from first principles to better understand:
- gradient descent
- backpropagation
- weight initialisation
- optimisation behaviour
- feature learning
Possible future extensions:
- Convolutional neural network implementation
- Batch normalisation
- Dropout regularisation
- Hyperparameter search
