A fully featured multilayer perceptron (MLP) implemented entirely from scratch in C++ for supervised learning tasks. It supports Mean Squared Error (MSE) and Cross-Entropy loss, online and offline training, and softmax outputs.
This project was developed as part of my Computer Engineering degree to learn and demonstrate the implementation of neural networks from the ground up.
Visualization of a simple 2‑2‑1 neural network architecture, showing input, hidden, and output layers.
This multilayer perceptron (MLP) implements the full forward and backward propagation pipeline from scratch. The steps are as follows (a condensed code sketch follows the list):

- **Forward Propagation**
  - Inputs are fed into the network.
  - Each hidden and output neuron computes a weighted sum of its inputs plus a bias.
  - Activation functions are applied:
    - Sigmoid for hidden layers
    - Sigmoid or Softmax for the output layer, depending on the task
  - The network produces an output vector representing either:
    - Continuous values for regression tasks
    - Class probabilities for classification tasks
- **Loss Calculation**
  - The network computes the error between predicted outputs and target values.
  - Supported loss functions:
    - Mean Squared Error (MSE)
    - Cross-Entropy Loss
- **Backpropagation**
  - Gradients of the loss with respect to each weight are computed using the chain rule.
  - Errors are propagated backward from the output layer to the input layer.
  - Weights are updated according to:
    - Learning rate (η)
    - Momentum (μ), if enabled
  - Supports online (per-sample) or offline (batch) weight updates.
- **Training Loop**
  - Repeat forward → loss → backward for a set number of iterations or until convergence.
  - Save the best-performing weights for evaluation or prediction.
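As a concrete illustration of these steps, here is a minimal, self-contained sketch of one online training step for a network with a single hidden layer, sigmoid activations, MSE loss, and momentum. It is only a sketch of the technique described above; the `Layer` and `train_step` names are illustrative, not the project's actual classes.

```cpp
// Illustrative sketch of one online training step (sigmoid + MSE + momentum).
// Names (Layer, train_step, ...) are examples, not the project's actual API.
#include <cmath>
#include <cstdlib>
#include <vector>

static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

struct Layer {
    // w[j][i]: weight from input i to neuron j; b[j]: bias of neuron j.
    // dw_prev/db_prev store the previous update so a momentum term can be added.
    std::vector<std::vector<double>> w, dw_prev;
    std::vector<double> b, db_prev, out;

    Layer(int n_in, int n_out)
        : w(n_out, std::vector<double>(n_in)), dw_prev(n_out, std::vector<double>(n_in, 0.0)),
          b(n_out, 0.0), db_prev(n_out, 0.0), out(n_out, 0.0) {
        for (auto& row : w)
            for (double& v : row) v = std::rand() / (double)RAND_MAX - 0.5;  // small random init
    }

    // Forward propagation: weighted sum plus bias, then sigmoid.
    const std::vector<double>& forward(const std::vector<double>& in) {
        for (size_t j = 0; j < w.size(); ++j) {
            double z = b[j];
            for (size_t i = 0; i < in.size(); ++i) z += w[j][i] * in[i];
            out[j] = sigmoid(z);
        }
        return out;
    }
};

// One online step: forward -> MSE -> backpropagation -> weight update
// with learning rate eta and momentum mu: dw = -eta * dE/dw + mu * dw_prev.
double train_step(Layer& hidden, Layer& output, const std::vector<double>& x,
                  const std::vector<double>& target, double eta, double mu) {
    const auto& h = hidden.forward(x);
    const auto& y = output.forward(h);

    // MSE and output deltas: delta_k = (y_k - t_k) * y_k * (1 - y_k)
    double mse = 0.0;
    std::vector<double> delta_out(y.size());
    for (size_t k = 0; k < y.size(); ++k) {
        double e = y[k] - target[k];
        mse += e * e;
        delta_out[k] = e * y[k] * (1.0 - y[k]);
    }
    mse /= y.size();

    // Hidden deltas via the chain rule: delta_j = (sum_k w[k][j] * delta_k) * h_j * (1 - h_j)
    std::vector<double> delta_hid(h.size());
    for (size_t j = 0; j < h.size(); ++j) {
        double s = 0.0;
        for (size_t k = 0; k < y.size(); ++k) s += output.w[k][j] * delta_out[k];
        delta_hid[j] = s * h[j] * (1.0 - h[j]);
    }

    // Apply the update rule to a layer given its inputs and deltas.
    auto update = [&](Layer& L, const std::vector<double>& in, const std::vector<double>& delta) {
        for (size_t j = 0; j < L.w.size(); ++j) {
            for (size_t i = 0; i < in.size(); ++i) {
                double dw = -eta * delta[j] * in[i] + mu * L.dw_prev[j][i];
                L.w[j][i] += dw;
                L.dw_prev[j][i] = dw;
            }
            double db = -eta * delta[j] + mu * L.db_prev[j];
            L.b[j] += db;
            L.db_prev[j] = db;
        }
    };
    update(output, h, delta_out);
    update(hidden, x, delta_hid);
    return mse;
}
```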
| Flag | Argument | Required | Description |
|---|---|---|---|
| `-t` | `<file>` | Yes | Path to the training dataset. |
| `-T` | `<file>` | No | Path to the test dataset. If omitted, the training set is used for testing. |
| `-l` | `<int>` | No | Number of hidden layers. Default: 1. |
| `-h` | `<int>` | No | Neurons per hidden layer. Default: 4. |
| `-i` | `<int>` | No | Maximum number of training iterations. Default: 1000. |
| `-e` | `<float>` | No | Learning rate (η). Default: 0.7. |
| `-m` | `<float>` | No | Momentum coefficient (μ). Default: 1.0. |
| `-f` | `<int>` | No | Loss function: 0 = MSE, 1 = Cross-Entropy. Default: 0. |
| `-n` | — | No | Enable input normalization (scale to [-1, 1]). |
| `-o` | — | No | Enable online training (weight updates per sample). |
| `-s` | — | No | Use softmax activation in the output layer. |
| `-p` | — | No | Prediction mode (Kaggle-style). Requires `-w` to load saved weights. |
| `-w` | `<file>` | No | Save trained model weights to a file. |
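To make the flags and defaults above concrete, here is a hypothetical sketch of how they could be collected into an options struct and parsed with POSIX `getopt`. The `Options` struct and `parse_args` function are illustrative names, not the project's actual code, and the real implementation may parse arguments differently.

```cpp
// Hypothetical flag-parsing sketch; defaults mirror the table above.
#include <string>
#include <unistd.h>  // getopt, optarg

struct Options {
    std::string train_file;     // -t (required)
    std::string test_file;      // -T (optional; training set is reused for testing if empty)
    int hidden_layers = 1;      // -l
    int neurons = 4;            // -h
    int iterations = 1000;      // -i
    double eta = 0.7;           // -e, learning rate
    double mu = 1.0;            // -m, momentum coefficient
    int loss = 0;               // -f (0 = MSE, 1 = Cross-Entropy)
    bool normalize = false;     // -n
    bool online = false;        // -o
    bool softmax = false;       // -s
    bool predict = false;       // -p
    std::string weights_file;   // -w
};

Options parse_args(int argc, char* argv[]) {
    Options opt;
    int c;
    while ((c = getopt(argc, argv, "t:T:l:h:i:e:m:f:nospw:")) != -1) {
        switch (c) {
            case 't': opt.train_file = optarg; break;
            case 'T': opt.test_file = optarg; break;
            case 'l': opt.hidden_layers = std::stoi(optarg); break;
            case 'h': opt.neurons = std::stoi(optarg); break;
            case 'i': opt.iterations = std::stoi(optarg); break;
            case 'e': opt.eta = std::stod(optarg); break;
            case 'm': opt.mu = std::stod(optarg); break;
            case 'f': opt.loss = std::stoi(optarg); break;
            case 'n': opt.normalize = true; break;
            case 'o': opt.online = true; break;
            case 's': opt.softmax = true; break;
            case 'p': opt.predict = true; break;
            case 'w': opt.weights_file = optarg; break;
        }
    }
    return opt;
}
```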
The network expects the dataset in a simple text format (.dat). Each file should contain numeric values only.
- First line: Network structure
  `<number_of_inputs> <number_of_outputs> <number_of_hidden_neurons_per_layer>`
- Following lines: Input and target values
- Each row contains input values followed by target values.
- Example:

  ```
  1 -1 1 0
  -1 -1 0 1
  -1 1 1 0
  1 1 0 1
  ```
- The first N numbers are inputs
- The last M numbers are targets
- No headers should be included; numeric values only.
- You can enable input normalization with the `-n` flag to scale inputs to [-1, 1].
- Supports multi-output for regression or classification tasks.
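As an illustration of this format, a loader could look like the sketch below: it reads the structure line, splits each row into inputs and targets, and optionally rescales the inputs to [-1, 1] as described for the `-n` flag. The `Dataset` struct and `load_dat` function are hypothetical names, not the project's actual API.

```cpp
// Hypothetical loader for the .dat format described above.
#include <algorithm>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

struct Dataset {
    int n_inputs = 0, n_outputs = 0, n_hidden = 0;
    std::vector<std::vector<double>> inputs, targets;
};

Dataset load_dat(const std::string& path, bool normalize) {
    std::ifstream in(path);
    if (!in) throw std::runtime_error("cannot open " + path);

    Dataset d;
    // First line: <number_of_inputs> <number_of_outputs> <number_of_hidden_neurons_per_layer>
    in >> d.n_inputs >> d.n_outputs >> d.n_hidden;

    // Remaining lines: N input values followed by M target values per row.
    std::vector<double> row(d.n_inputs + d.n_outputs);
    while (in >> row[0]) {
        for (int i = 1; i < d.n_inputs + d.n_outputs; ++i) in >> row[i];
        d.inputs.emplace_back(row.begin(), row.begin() + d.n_inputs);
        d.targets.emplace_back(row.begin() + d.n_inputs, row.end());
    }

    // Optional min-max rescaling of each input column to [-1, 1] (the -n behavior).
    if (normalize && !d.inputs.empty()) {
        for (int i = 0; i < d.n_inputs; ++i) {
            double lo = d.inputs[0][i], hi = d.inputs[0][i];
            for (const auto& x : d.inputs) { lo = std::min(lo, x[i]); hi = std::max(hi, x[i]); }
            if (hi == lo) continue;  // constant column: leave unchanged
            for (auto& x : d.inputs) x[i] = 2.0 * (x[i] - lo) / (hi - lo) - 1.0;
        }
    }
    return d;
}
```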
The XOR problem is a classic example of a non-linearly separable function. A simple perceptron cannot solve it, but a small MLP with one hidden layer and two neurons can learn it due to non-linear activation functions.
| Input 1 | Input 2 | Target |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
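Expressed in the .dat format described earlier, this truth table could be written as the file below. The structure line here assumes 2 inputs, 1 output, and 2 hidden neurons per layer (matching the 2-2-1 architecture); the repository's actual data/train_xor.dat may differ, for example by using {-1, 1} inputs or one-hot targets as in the earlier format example.

```
2 1 2
0 0 0
0 1 1
1 0 1
1 1 0
```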
The network predicts continuous outputs using a sigmoid activation on the output neuron.
Sigmoid predictions for XOR regression. The model successfully separates the non-linear classes.
Training convergence for regression: MSE decreases over iterations for both training and test datasets.
For classification, the output layer uses softmax activation to provide class probabilities.
Softmax probabilities for XOR classification. The model successfully separates the non-linear classes.
Training convergence for classification: Cross-Entropy improves and stabilizes as training progresses.
This demonstrates the network's ability to learn non-linear mappings, a key feature of multilayer perceptrons.
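For reference, here is a minimal sketch of the softmax and cross-entropy pieces used in the classification setup. It illustrates the technique rather than reproducing the project's code; a convenient property shown in the last helper is that with softmax outputs and cross-entropy loss, the output-layer delta reduces to y_k - t_k, which plugs directly into the backpropagation step sketched earlier.

```cpp
// Illustrative softmax and cross-entropy helpers (not the project's exact code).
#include <algorithm>
#include <cmath>
#include <vector>

// Softmax over the raw output-layer sums z, shifted by max(z) for numerical stability.
std::vector<double> softmax(const std::vector<double>& z) {
    double zmax = *std::max_element(z.begin(), z.end());
    std::vector<double> y(z.size());
    double sum = 0.0;
    for (size_t k = 0; k < z.size(); ++k) { y[k] = std::exp(z[k] - zmax); sum += y[k]; }
    for (double& v : y) v /= sum;
    return y;
}

// Cross-entropy loss for one sample: -sum_k t_k * log(y_k).
double cross_entropy(const std::vector<double>& y, const std::vector<double>& t) {
    double loss = 0.0;
    for (size_t k = 0; k < y.size(); ++k) loss -= t[k] * std::log(y[k] + 1e-12);  // epsilon avoids log(0)
    return loss;
}

// With softmax + cross-entropy, the output-layer delta is simply (y_k - t_k).
std::vector<double> output_delta(const std::vector<double>& y, const std::vector<double>& t) {
    std::vector<double> delta(y.size());
    for (size_t k = 0; k < y.size(); ++k) delta[k] = y[k] - t[k];
    return delta;
}
```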
Follow these steps to compile and run the neural network:
The repository includes a Makefile. Simply run:
```bash
git clone https://github.com/javierespdev/neural-network-from-scratch.git
cd neural-network-from-scratch
make
```

Run the executable with the training dataset:
```bash
./bin/mlp -t data/train_xor.dat
```

By default, the network will train with 1 hidden layer and 4 neurons per layer. You can adjust parameters like the learning rate, number of iterations, or number of hidden layers using command-line flags:
```bash
./bin/mlp -t data/train_xor.dat -l 2 -h 3 -i 5000 -e 0.5 -f 1 -s -o
```

- `-l 2` → 2 hidden layers
- `-h 3` → 3 neurons per hidden layer
- `-i 5000` → 5000 iterations
- `-e 0.5` → learning rate 0.5
- `-f 1` → Cross-Entropy loss
- `-s` → softmax output
- `-o` → online training mode
Once trained, you can save weights and use them for prediction:
```bash
# Save trained weights
./bin/mlp -t data/train_xor.dat -w weights.bin

# Predict with saved model
./bin/mlp -p -w weights.bin -T data/test_xor.dat
```

This project is licensed under the MIT License - see the LICENSE file for details.