This project implements a Multilayer Perceptron (MLP) from scratch to predict breast cancer diagnoses. The MLP is trained on the Breast Cancer Wisconsin dataset, which contains features computed from digitized images of fine needle aspirate (FNA) of breast masses. The goal is to classify the tumors as malignant or benign based on these features.
The Breast Cancer Wisconsin dataset is used for training and evaluating the MLP. The dataset contains 569 samples, each with 30 numeric features computed from digitized images of FNA of breast masses. The features describe characteristics of the cell nuclei present in the image. The target variable is the diagnosis, which is either malignant (M) or benign (B).
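As a rough illustration of how data of this shape might be prepared for the network, the sketch below standardizes each feature column and one-hot encodes the M/B labels. It assumes NumPy. The z-score scheme, the function names `standardize` and `one_hot_diagnosis`, and the column order (benign first, malignant second) are assumptions made for illustration, and the project's actual preprocessing may differ.

```python
import numpy as np

def standardize(features):
    """Scale each feature column to zero mean and unit variance (assumed scheme)."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (features - mean) / std

def one_hot_diagnosis(labels):
    """Map 'B'/'M' labels to one-hot rows: column 0 = benign, column 1 = malignant."""
    classes = {"B": 0, "M": 1}  # an unknown label raises KeyError
    index = np.array([classes[label] for label in labels])
    return np.eye(2)[index]

# Tiny made-up example: 3 samples, 2 features (the real dataset is 569 x 30).
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
y = ["M", "B", "M"]
X_scaled = standardize(X)
Y = one_hot_diagnosis(y)
```

In practice the mean and standard deviation would usually be computed on the training split only and reused for validation and prediction data.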
The MLP is implemented using the following components:
- Layer Class: Defines a single layer in the neural network, including initialization of weights and biases, forward and backward propagation, and update rules.
- NeuralNetwork Class: Manages the layers, training process, and evaluation metrics. It includes methods for initializing layers, forward and backward propagation, gradient descent, and early stopping.
- Activation Functions: Sigmoid and Softmax activation functions are used in the hidden and output layers, respectively.
- Weight Initialization: Weights are initialized using either a uniform distribution or Xavier initialization.
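The components above could fit together roughly as follows. This is a minimal forward-pass sketch assuming NumPy, not the project's actual code: the class name `DenseLayer`, its constructor arguments, the Glorot-uniform form of Xavier initialization, and the 30-10-10-2 architecture are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

class DenseLayer:
    """Hypothetical fully connected layer (forward pass only)."""

    def __init__(self, n_in, n_out, activation, rng):
        # Xavier (Glorot) uniform: limit = sqrt(6 / (n_in + n_out)).
        limit = np.sqrt(6.0 / (n_in + n_out))
        self.weights = rng.uniform(-limit, limit, size=(n_in, n_out))
        self.biases = np.zeros((1, n_out))
        self.activation = activation

    def forward(self, inputs):
        return self.activation(inputs @ self.weights + self.biases)

rng = np.random.default_rng(0)
# Assumed architecture: 30 features -> 10 -> 10 hidden units -> 2 classes.
layers = [
    DenseLayer(30, 10, sigmoid, rng),
    DenseLayer(10, 10, sigmoid, rng),
    DenseLayer(10, 2, softmax, rng),
]

batch = rng.normal(size=(4, 30))  # 4 made-up samples
output = batch
for layer in layers:
    output = layer.forward(output)
```

Each row of `output` holds two class probabilities that sum to 1, which is what the softmax output layer is meant to provide.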
The training process involves the following steps:
- Data Preprocessing: The features are normalized, and the targets are one-hot encoded.
- Layer Initialization: The layers are initialized based on the specified architecture and weight initialization method.
- Forward Propagation: The input data is passed through the layers to compute the output.
- Backward Propagation: The error is computed at the output, and its gradients are propagated back through the network to obtain the gradient of the loss with respect to each layer's weights and biases.
- Gradient Descent: The weights are updated using gradient descent with an optional momentum term.
- Early Stopping: Training is stopped early if the validation loss does not improve for a specified number of epochs.
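The last two steps can be sketched in isolation. The code below is an illustrative assumption rather than the project's implementation: `momentum_step` uses the classical momentum rule, and `EarlyStopping` with its `patience` parameter is a hypothetical helper. A toy one-dimensional loss stands in for the network so the example stays self-contained.

```python
def momentum_step(weight, velocity, gradient, learning_rate, momentum):
    """Classical momentum update; works on floats or NumPy arrays."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity

class EarlyStopping:
    """Signal a stop once the loss fails to improve for `patience` epochs in a row."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def should_stop(self, val_loss):
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience

# Toy problem: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
# The loss itself stands in for a validation loss here. A momentum of 0.5
# keeps this toy loss decreasing monotonically instead of oscillating.
w, v = 0.0, 0.0
stopper = EarlyStopping(patience=5)
for epoch in range(2500):
    grad = 2.0 * (w - 3.0)
    w, v = momentum_step(w, v, grad, learning_rate=0.03, momentum=0.5)
    if stopper.should_stop((w - 3.0) ** 2):
        break
```

In a real training loop the weights of every layer would be updated this way each epoch, and `should_stop` would receive the loss measured on a held-out validation split.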
The model's performance is evaluated using the following metrics:
- Binary Cross Entropy Loss: Measures how far the predicted class probabilities are from the true labels; lower values indicate a better fit.
- Accuracy: The proportion of correct predictions.
- Precision: The proportion of true positive predictions among all positive predictions.
- Recall: The proportion of true positive predictions among all actual positives.
- F1 Score: The harmonic mean of precision and recall.
- Confusion Matrix: A table of true positives, false positives, true negatives, and false negatives.
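The metrics listed above can be computed from the four confusion-matrix counts. The sketch below uses only the standard library. The function names are hypothetical, and treating malignant (M) as the positive class is an assumption, not something the project states.

```python
import math

def confusion_counts(y_true, y_pred, positive="M"):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive="M"):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def binary_cross_entropy(y_true, p_positive, eps=1e-15):
    """Mean BCE; y_true holds 1 for the positive class and 0 otherwise."""
    total = 0.0
    for y, p in zip(y_true, p_positive):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Made-up labels and predictions for illustration.
y_true = ["M", "M", "B", "B", "M"]
y_pred = ["M", "B", "B", "M", "M"]
metrics = classification_metrics(y_true, y_pred)
loss = binary_cross_entropy([1, 0], [0.9, 0.2])
```

For this made-up sample the counts are 2 true positives, 1 false positive, 1 false negative, and 1 true negative, so accuracy is 0.6 and precision, recall, and F1 all equal 2/3.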
To run the project, first install the dependencies:

    pip install -r requirements.txt

To train the model, use the following command:

    python multi_layer_preceptron.py -train --learning_rate 0.03 --epochs 2500 --shape 10 10 --momentum 0.9 data/breast_cancer.csv

To make predictions using the trained model, use the following command:

    python multi_layer_preceptron.py -predict data/breast_cancer.csv

The visualization.py script generates correlation heatmaps to visualize the relationships between features in the dataset.




