Breast Cancer Prediction with Multilayer Perceptron

Overview

This project implements a Multilayer Perceptron (MLP) from scratch to predict breast cancer diagnoses. The MLP is trained on the Breast Cancer Wisconsin dataset, which contains features computed from digitized images of fine needle aspirate (FNA) of breast masses. The goal is to classify the tumors as malignant or benign based on these features.

Technical Aspects

Dataset

The Breast Cancer Wisconsin dataset is used for training and evaluating the MLP. The dataset contains 569 samples, each with 30 numeric features computed from digitized images of FNA of breast masses. The features describe characteristics of the cell nuclei present in the image. The target variable is the diagnosis, which is either malignant (M) or benign (B).
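As a rough illustration of the layout described above, the sketch below parses rows in the column order used by the original UCI file (ID, diagnosis, then 30 feature values, no header row). Whether data/breast_cancer.csv follows that exact order is an assumption, and `parse_rows` is a hypothetical helper, not a function from this project.

```python
import csv
import io

# Assumed column order (UCI wdbc.data): id, diagnosis (M/B), then 30 numeric features.
N_FEATURES = 30


def parse_rows(text):
    """Split CSV text into a feature matrix and a list of 'M'/'B' labels."""
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        diagnosis = row[1].strip()
        if diagnosis not in ("M", "B"):
            raise ValueError(f"unexpected diagnosis label: {diagnosis!r}")
        values = [float(v) for v in row[2:2 + N_FEATURES]]
        if len(values) != N_FEATURES:
            raise ValueError(f"expected {N_FEATURES} features, got {len(values)}")
        features.append(values)
        labels.append(diagnosis)
    return features, labels


# Two made-up rows, only to show the call; these are not real samples.
sample = "\n".join(
    f"{i},{label}," + ",".join(str(float(j)) for j in range(N_FEATURES))
    for i, label in [(1, "M"), (2, "B")]
)
X, y = parse_rows(sample)
print(len(X), len(X[0]), y)
```

Rejecting unknown labels and short rows early keeps a malformed file from silently producing a smaller or misaligned training set.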

Model Architecture

The MLP is implemented using the following components:

Layer Class: Defines a single layer in the neural network, including initialization of weights and biases, forward and backward propagation, and update rules.
NeuralNetwork Class: Manages the layers, training process, and evaluation metrics. It includes methods for initializing layers, forward and backward propagation, gradient descent, and early stopping.
Activation Functions: Sigmoid and Softmax activation functions are used in the hidden and output layers, respectively.
Weight Initialization: Weights are initialized either from a uniform distribution or with Xavier initialization.
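The components listed above can be sketched in a few lines. The example below is a minimal, dependency-free illustration of sigmoid and softmax activations, Xavier (Glorot) uniform initialization, and one forward pass through a 30-10-2 network. The function names, the zero-initialized biases, and the layer sizes are assumptions chosen for the example; the project's own Layer class may be organized differently.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function applied to each element of z."""
    out = []
    for v in z:
        if v >= 0:
            out.append(1.0 / (1.0 + math.exp(-v)))
        else:
            e = math.exp(v)  # avoids overflow for large negative v
            out.append(e / (1.0 + e))
    return out


def softmax(z):
    """Softmax with the max subtracted first so exp() cannot overflow."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def xavier_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform: U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)] for _ in range(fan_out)]


def dense(weights, biases, x):
    """Affine map: one output per weight row, z_j = sum_i w[j][i] * x[i] + b[j]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


# Forward pass through an assumed 30 -> 10 -> 2 architecture.
rng = random.Random(0)
w1, b1 = xavier_uniform(30, 10, rng), [0.0] * 10
w2, b2 = xavier_uniform(10, 2, rng), [0.0] * 2
x = [rng.uniform(-1.0, 1.0) for _ in range(30)]  # stand-in for one normalized sample

hidden = sigmoid(dense(w1, b1, x))
probs = softmax(dense(w2, b2, hidden))
print(probs)  # two class probabilities that sum to 1
```

Scaling the initialization range by the layer's fan-in and fan-out keeps the pre-activations in the region where the sigmoid is not saturated, which is the usual motivation for choosing Xavier over a fixed uniform range.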

Training Process

The training process involves the following steps:

Data Preprocessing: The features are normalized, and the targets are one-hot encoded.
Layer Initialization: The layers are initialized based on the specified architecture and weight initialization method.
Forward Propagation: The input data is passed through the layers to compute the output.
Backward Propagation: The error at the output is computed, and its gradient is propagated back through the layers to obtain the gradient of the loss with respect to each weight and bias.
Gradient Descent: The weights are updated using gradient descent with an optional momentum term.
Early Stopping: Training is stopped early if the validation loss does not improve for a specified number of epochs.
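Three of the steps above are easy to show in isolation: preprocessing, the momentum update, and early stopping. The sketch below assumes z-score normalization, the classical momentum formulation, and a patience counter on validation loss. All names are hypothetical, and the project may use a different normalization or momentum variant.

```python
def standardize(column):
    """Z-score one feature column (assumed normalization): zero mean, unit variance."""
    mean = sum(column) / len(column)
    var = sum((v - mean) ** 2 for v in column) / len(column)
    std = var ** 0.5 or 1.0  # fall back to 1.0 for constant columns
    return [(v - mean) / std for v in column]


def one_hot(label, classes=("B", "M")):
    """Encode a diagnosis label as a one-hot vector in the order given by classes."""
    return [1.0 if label == c else 0.0 for c in classes]


def momentum_step(weights, grads, velocity, learning_rate, momentum):
    """Classical momentum: v <- momentum * v - lr * grad, then w <- w + v."""
    new_velocity = [momentum * v - learning_rate * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Preprocessing on toy values.
print(standardize([1.0, 2.0, 3.0]))
print(one_hot("M"))

# Two momentum updates with the learning rate and momentum from the example command.
w, v = [1.0], [0.0]
for _ in range(2):
    w, v = momentum_step(w, grads=[2.0], velocity=v, learning_rate=0.03, momentum=0.9)
print(w, v)

# Early stopping on a made-up validation-loss curve.
stopper = EarlyStopping(patience=3)
val_losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
stop_epoch = next(i for i, loss in enumerate(val_losses) if stopper.should_stop(loss))
print(stop_epoch)
```

In the toy curve the loss would have improved again at the last epoch, which shows the trade-off in the patience setting: a small value saves time but can stop just before a late improvement.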

Evaluation Metrics

The model's performance is evaluated using the following metrics:

Binary Cross Entropy Loss: Measures how far the predicted probabilities are from the true labels; lower values are better.
Accuracy: The proportion of correct predictions.
Precision: The proportion of true positive predictions among all positive predictions.
Recall: The proportion of true positive predictions among all actual positives.
F1 Score: The harmonic mean of precision and recall.
Confusion Matrix: A table counting true positives, false positives, true negatives, and false negatives.
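The metrics above all follow from the four confusion-matrix counts, plus the predicted probabilities for the loss. The sketch below computes them in plain Python. Treating malignant as the positive class (encoded as 1) is an assumption, and the function names are illustrative rather than taken from utils.py.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Division that returns 0.0 instead of failing when the denominator is 0."""
    return numerator / denominator if denominator else 0.0


def report(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


def binary_cross_entropy(y_true, p_pred, eps=1e-15):
    """Mean of -[t*log(p) + (1-t)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return -total / len(y_true)


# Made-up labels and predictions, 1 = malignant (assumed positive class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))  # (3, 1, 3, 1)
print(report(y_true, y_pred))
print(binary_cross_entropy([1, 0], [0.9, 0.1]))
```

For a diagnostic task, recall on the malignant class is usually the number to watch, since a false negative (a missed malignant tumor) is costlier than a false positive.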

Usage

Installation

To run the project, install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Training the Model

To train the model, use the following command:

python multi_layer_preceptron.py -train --learning_rate 0.03 --epochs 2500 --shape 10 10 --momentum 0.9 data/breast_cancer.csv

Making Predictions

To make predictions using the trained model, use the following command:

python multi_layer_preceptron.py -predict data/breast_cancer.csv
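For readers curious how the two commands above could be parsed, here is a hedged sketch using Python's standard argparse module. It only mirrors the flags shown in the commands. The `build_parser` helper, the help texts, the absence of defaults, and the reading of --shape as hidden-layer sizes are all assumptions; the real script's parser is not shown in this README.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Train or run the MLP (illustrative sketch).")

    # Exactly one mode must be chosen.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-train", action="store_true", help="train a new model")
    mode.add_argument("-predict", action="store_true", help="predict with a trained model")

    # Hyperparameters; no defaults are set here because the README does not state any.
    parser.add_argument("--learning_rate", type=float, help="gradient descent step size")
    parser.add_argument("--epochs", type=int, help="maximum number of training epochs")
    parser.add_argument("--shape", type=int, nargs="+",
                        help="layer sizes (assumed to be the hidden layers)")
    parser.add_argument("--momentum", type=float, help="momentum coefficient")

    parser.add_argument("dataset", help="path to the CSV file")
    return parser


train_cmd = ("-train --learning_rate 0.03 --epochs 2500 "
             "--shape 10 10 --momentum 0.9 data/breast_cancer.csv")
args = build_parser().parse_args(train_cmd.split())
print(args.train, args.shape, args.dataset)

predict_args = build_parser().parse_args(["-predict", "data/breast_cancer.csv"])
print(predict_args.predict, predict_args.dataset)
```

Putting -train and -predict in a mutually exclusive group makes argparse reject a command that sets both or neither, so that check does not need to be written by hand.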

Results

Output

Training and Validation Metrics

Confusion Matrix

Visualizations

Correlation Heatmaps

The visualization.py script generates correlation heatmaps, one before feature selection and one after, to show how strongly the features in the dataset are related to each other.
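A correlation heatmap is a colored rendering of a feature-by-feature correlation matrix. The sketch below computes such a matrix with the Pearson coefficient in plain Python. The use of Pearson correlation, the function names, and the three toy columns are assumptions for illustration; visualization.py may compute and plot the matrix with other tools.

```python
def pearson(a, b):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = (var_a * var_b) ** 0.5
    return cov / denom if denom else 0.0  # constant column: report 0.0


def correlation_matrix(columns):
    """Square matrix whose entry [i][j] is the correlation of column i with column j."""
    return [[pearson(a, b) for b in columns] for a in columns]


# Toy columns with hypothetical names; not values from the dataset.
radius = [1.0, 2.0, 3.0, 4.0]
perimeter = [2.0, 4.1, 6.0, 8.2]   # nearly proportional to radius
smoothness = [4.0, 3.0, 2.0, 1.0]  # decreases as radius increases

matrix = correlation_matrix([radius, perimeter, smoothness])
for row in matrix:
    print([round(value, 3) for value in row])
```

Pairs with a coefficient close to 1 or -1, like the radius and perimeter columns here, carry largely redundant information, which is what makes such a matrix a common starting point for feature selection.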

