Neural Network from Scratch in C++

A fully-featured multilayer perceptron (MLP) implemented entirely from scratch in C++ for supervised learning tasks. Supports both Mean Squared Error (MSE) and Cross-Entropy loss, online/offline training, and softmax outputs.

This project was developed as part of my Computer Engineering degree to learn and demonstrate the implementation of neural networks from the ground up.

Visualization of a simple 2‑2‑1 neural network architecture, showing input, hidden, and output layers.

How the Network Works

This multilayer perceptron (MLP) implements the full forward and backward propagation pipeline from scratch. The steps are as follows (a minimal code sketch of a single training step appears after the list):

  1. Forward Propagation

    • Inputs are fed into the network.
    • Each hidden and output neuron computes a weighted sum of its inputs plus a bias.
    • Activation functions are applied:
      • Sigmoid for hidden layers
      • Sigmoid or Softmax for the output layer depending on the task
    • The network produces an output vector representing either:
      • Continuous values for regression tasks
      • Class probabilities for classification tasks
  2. Loss Calculation

    • The network computes the error between predicted outputs and target values.
    • Supported loss functions:
      • Mean Squared Error (MSE)
      • Cross-Entropy Loss
  3. Backpropagation

    • Gradients of the loss with respect to each weight are computed using the chain rule.
    • Errors are propagated from output to input layers.
    • Weights are updated according to:
      • Learning rate (η)
      • Momentum (μ) if enabled
    • Supports online (per-sample) or offline (batch) weight updates
  4. Training Loop

    • Repeat forward → loss → backward for a number of iterations or until convergence.
    • Save best-performing weights for evaluation or prediction.
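
The sketch below ties these four steps together for one training sample. It is an illustrative example only, not the project's actual classes: a hypothetical 2-2-1 network with sigmoid activations, MSE loss, and an online update of the output weights using the learning rate and momentum; the hidden-layer updates follow the same chain-rule pattern and are omitted for brevity.

#include <cmath>
#include <cstdio>

static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

int main() {
    // Illustrative weights and biases for a 2-2-1 network (hypothetical values).
    double wHidden[2][2] = {{0.5, -0.4}, {0.3, 0.8}};
    double bHidden[2]    = {0.1, -0.2};
    double wOut[2]       = {0.7, -0.6};
    double bOut          = 0.05;
    double dwOutPrev[2]  = {0.0, 0.0};   // previous updates, kept for the momentum term

    const double eta = 0.7;              // learning rate (-e)
    const double mu  = 1.0;              // momentum coefficient (-m)

    double x[2] = {1.0, 0.0};            // one training sample
    double t    = 1.0;                   // its target (XOR: 1 XOR 0 = 1)

    // 1. Forward propagation: weighted sum plus bias, then sigmoid activation.
    double h[2];
    for (int j = 0; j < 2; ++j)
        h[j] = sigmoid(wHidden[j][0] * x[0] + wHidden[j][1] * x[1] + bHidden[j]);
    double y = sigmoid(wOut[0] * h[0] + wOut[1] * h[1] + bOut);

    // 2. Loss calculation (MSE for a single sample).
    double loss = 0.5 * (t - y) * (t - y);

    // 3. Backpropagation: output delta via the chain rule, then an
    //    online weight update with learning rate and momentum.
    double deltaOut = (y - t) * y * (1.0 - y);       // dLoss/dNet at the output
    for (int j = 0; j < 2; ++j) {
        double dw = -eta * deltaOut * h[j] + mu * dwOutPrev[j];
        wOut[j] += dw;
        dwOutPrev[j] = dw;                           // remembered for the next update
    }
    bOut += -eta * deltaOut;

    // 4. A real training loop repeats steps 1-3 over all samples and iterations.
    std::printf("prediction = %.4f, loss = %.4f\n", y, loss);
    return 0;
}

The full project generalizes this to arbitrary layer counts, Cross-Entropy loss, and softmax outputs, all selected through the command-line options below.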

Command-Line Parameters

Flag Argument Required Description
-t <file> Yes Path to the training dataset.
-T <file> No Path to the test dataset. If omitted, training set is used for testing.
-l <int> No Number of hidden layers. Default: 1.
-h <int> No Neurons per hidden layer. Default: 4.
-i <int> No Maximum number of training iterations. Default: 1000.
-e <float> No Learning rate (η). Default: 0.7.
-m <float> No Momentum coefficient (μ). Default: 1.0.
-f <int> No Loss function: 0 = MSE, 1 = Cross-Entropy. Default: 0.
-n No Enable input normalization (scale to [-1, 1]).
-o No Enable online training (weight updates per sample).
-s No Use softmax activation in the output layer.
-p No Prediction mode (Kaggle-style). Requires -w to load saved weights.
-w <file> No Path to the weights file: saves the trained weights after training, or loads them when combined with -p for prediction.
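
As an illustration of how these flags combine (the dataset paths are the XOR files used later in this document), the following invocation would train two hidden layers of 8 neurons for 2000 iterations with learning rate 0.1, momentum 0.9, Cross-Entropy loss, a softmax output, normalized inputs, and a separate test set:

./bin/mlp -t data/train_xor.dat -T data/test_xor.dat -l 2 -h 8 -i 2000 -e 0.1 -m 0.9 -f 1 -s -n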

Dataset Format

The network expects the dataset in a simple text format (.dat). Each file should contain numeric values only.

File Structure

  1. First line: Network structure
<number_of_inputs> <number_of_outputs> <number_of_hidden_neurons_per_layer>
  2. Following lines: Input and target values
  • Each row contains input values followed by target values.
  • Example:
1 -1 1 0
-1 -1 0 1
-1 1 1 0
1 1 0 1
  • The first N numbers are inputs
  • The last M numbers are targets
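
Putting the two parts together, a complete file for the four example rows above (2 inputs, 2 targets and, for illustration, 2 hidden neurons per layer) would look like this:

2 2 2
1 -1 1 0
-1 -1 0 1
-1 1 1 0
1 1 0 1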

Guidelines

  • No headers should be included; numeric values only.
  • You can enable input normalization with the -n flag to scale inputs to [-1, 1].
  • Supports multi-output for regression or classification tasks.

Example: Solving the XOR Problem

The XOR problem is a classic example of a non-linearly separable function. A simple perceptron cannot solve it, but a small MLP with one hidden layer and two neurons can learn it due to non-linear activation functions.

XOR Dataset

Input 1 Input 2 Target
0 0 0
0 1 1
1 0 1
1 1 0
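
Written in the dataset format described above (2 inputs, 1 target and, for illustration, 2 hidden neurons per layer), this truth table becomes the following file; the exact contents of data/train_xor.dat may differ, for example if inputs are encoded as -1/1:

2 1 2
0 0 0
0 1 1
1 0 1
1 1 0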

Regression Mode (Sigmoid Output)

The network predicts continuous outputs using a sigmoid activation on the output neuron.

XOR Regression Output
Sigmoid predictions for XOR regression. The model successfully separates the non-linear classes.

Convergence on regression

Training convergence for regression: MSE decreases over iterations for both training and test datasets.

Classification Mode (Softmax Output)

For classification, the output layer uses softmax activation to provide class probabilities.
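
As a rough sketch of what that means (not the project's exact code), softmax turns the output layer's raw weighted sums into positive values that add up to 1:

#include <algorithm>
#include <cmath>
#include <vector>

// Illustrative softmax: shift by the maximum before exponentiating for
// numerical stability, then normalize so the outputs form a probability distribution.
std::vector<double> softmax(const std::vector<double>& z) {
    double zMax = *std::max_element(z.begin(), z.end());
    std::vector<double> p;
    p.reserve(z.size());
    double sum = 0.0;
    for (double zi : z) {
        p.push_back(std::exp(zi - zMax));
        sum += p.back();
    }
    for (double& pi : p) pi /= sum;
    return p;
}

With one output neuron per class, these probabilities are what the classification plot below shows.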

XOR Classification Output
Softmax probabilities for XOR classification. The model successfully separates the non-linear classes.

Convergence on classification

XOR Classification Output
Training convergence for classification: the Cross-Entropy loss decreases and stabilizes as training progresses.

This demonstrates the network's ability to learn non-linear mappings, a key feature of multilayer perceptrons.

Running the Multilayer Perceptron

Follow these steps to compile and run the neural network:

1. Clone the repository and compile the project

The repository includes a Makefile. Simply run:

git clone https://github.com/javierespdev/neural-network-from-scratch.git
cd neural-network-from-scratch
make

2. Train the network

Run the executable with the training dataset:

./bin/mlp -t data/train_xor.dat

By default, the network will train with 1 hidden layer and 4 neurons per layer. You can adjust parameters like learning rate, number of iterations, or hidden layers using command-line flags.

3. Example: Using custom options

./bin/mlp -t data/train_xor.dat -l 2 -h 3 -i 5000 -e 0.5 -f 1 -s -o
  • -l 2 → 2 hidden layers
  • -h 3 → 3 neurons per hidden layer
  • -i 5000 → 5000 iterations
  • -e 0.5 → learning rate 0.5
  • -f 1 → Cross-Entropy loss
  • -s → Softmax output
  • -o → online training mode

4. Prediction mode

Once trained, you can save weights and use them for prediction:

# Save trained weights
./bin/mlp -t data/train_xor.dat -w weights.bin

# Predict with saved model
./bin/mlp -p -w weights.bin -T data/test_xor.dat

License

This project is licensed under the MIT License - see the LICENSE file for details.
