multilayerXOR

A Neural Network that emulates logical gates such as the XOR or the CSWAP


Multilayer Perceptron

A couple of years ago I had to come up with a project for a course and decided to code a Neural Network in C++.

This repository is a slightly brushed-up version of that project. It makes use of the original Neural Network class written for the course (header/NeuralNetwork.h), and I have added Python scripts for conveniently generating data and plotting errors.

The pipeline

  • creates training and testing samples of the XOR gate or the CSWAP gate (a 3-bit gate)
  • fits the model to the training data and evaluates the performance on the test set
  • creates a plot of the error measure

Logical Gates | Neural Network | Pipeline

Logical gates

The Neural Network (header/NeuralNetwork.h) can be initialized with different numbers of neurons in each layer. This invites experimentation with more exotic types of gates - but first:

XOR

Logical gates such as AND and OR can typically be modeled by a single perceptron. The exclusive OR (XOR) is an exception, as it is not linearly separable; a Neural Network requires a hidden layer to emulate it.

The XOR returns True if the inputs are distinct and False otherwise:

| Input 1 | Input 2 | Output |
|---------|---------|--------|
| 0       | 0       | 0      |
| 0       | 1       | 1      |
| 1       | 0       | 1      |
| 1       | 1       | 0      |
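
To make the non-separability claim concrete: a single perceptron thresholds a weighted sum, and the four rows of the table impose contradictory constraints on any weights w_1, w_2 and bias b:

    \begin{aligned}
    (0,0) \mapsto 0: &\quad b < 0 \\
    (0,1) \mapsto 1: &\quad w_2 + b > 0 \\
    (1,0) \mapsto 1: &\quad w_1 + b > 0 \\
    (1,1) \mapsto 0: &\quad w_1 + w_2 + b < 0
    \end{aligned}

Adding the two middle inequalities gives w_1 + w_2 + 2b > 0, i.e. w_1 + w_2 + b > -b > 0, which contradicts the last line. Hence the hidden layer is necessary.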

CSWAP

The Controlled SWAP (CSWAP) gate, or Fredkin gate (wikipedia), has 3 input and 3 output bits and can be used in Quantum Computing. It works as follows:

If the C-bit is non-zero, Input 1 and Input 2 are swapped. This is conveyed in the following table:

| C | Input 1 | Input 2 | C | Output 1 | Output 2 |
|---|---------|---------|---|----------|----------|
| 0 | 0       | 0       | 0 | 0        | 0        |
| 0 | 0       | 1       | 0 | 0        | 1        |
| 0 | 1       | 0       | 0 | 1        | 0        |
| 0 | 1       | 1       | 0 | 1        | 1        |
| 1 | 0       | 0       | 1 | 0        | 0        |
| 1 | 0       | 1       | 1 | 1        | 0        |
| 1 | 1       | 0       | 1 | 0        | 1        |
| 1 | 1       | 1       | 1 | 1        | 1        |
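
The gate itself reduces to a single conditional swap. A minimal C++ sketch (the function name is mine, not taken from the repository):

#include <utility>  // for std::swap

// CSWAP / Fredkin gate: if the control bit c is set,
// exchange the two target bits a and b.
void cswap(int c, int &a, int &b) {
    if (c != 0)
        std::swap(a, b);
}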

Neural Network

The Neural Network has 3 layers: an input, an output, and a hidden layer. The number of neurons per layer is variable, and the number of input/output neurons will be inferred from the data provided.

For the XOR gate, the network will have 2 inputs and 1 output. In the case of the CSWAP gate the network will have 3 inputs and 3 outputs. The number of hidden neurons is 2 by default and should be adjusted upwards when training the model on CSWAP data.
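
Assuming the class is constructed from a list of layer sizes (a common pattern for such classes; the actual constructor in header/NeuralNetwork.h may differ), the two configurations could be set up like this:

#include <vector>
#include "header/NeuralNetwork.h"

int main() {
    // Hypothetical constructor taking {inputs, hidden, outputs};
    // the real signature may differ.
    std::vector<unsigned> xorTopology   = {2, 2, 1};  // XOR: 2 in, 1 out
    std::vector<unsigned> cswapTopology = {3, 6, 3};  // CSWAP: more hidden neurons

    NeuralNetwork xorNet(xorTopology);
    NeuralNetwork cswapNet(cswapTopology);
    return 0;
}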

The hyperbolic tangent, tanh(x), is used as the activation function.
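
In code, the activation and the derivative needed for backpropagation might look as follows; expressing the derivative through the neuron's output y = tanh(x) is a common shortcut (whether the repository does exactly this is an assumption):

#include <cmath>

// Activation: squashes the weighted input sum into (-1, 1).
double activate(double x) { return std::tanh(x); }

// Derivative in terms of the output y = tanh(x):
// d/dx tanh(x) = 1 - tanh(x)^2 = 1 - y*y.
double activateDerivative(double y) { return 1.0 - y * y; }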

Loss: the Root Mean Square (RMS) error is used:

    E = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (t_i - o_i)^2}

where the index i runs over all N neurons of the output layer (for the XOR that is only one), t_i denotes the target value and o_i the actual output.
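
A direct transcription of this loss (variable names are illustrative):

#include <cmath>
#include <vector>

// RMS error between the network's outputs and the target values.
double rmsError(const std::vector<double> &outputs,
                const std::vector<double> &targets) {
    double sum = 0.0;
    for (std::size_t i = 0; i < outputs.size(); ++i) {
        const double delta = targets[i] - outputs[i];
        sum += delta * delta;
    }
    return std::sqrt(sum / outputs.size());
}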

At every step a smoothed error measurement, the recent average error (RAE), is printed. Recursively, it can be defined like this:

    RAE_n = \frac{\xi \cdot RAE_{n-1} + E_n}{\xi + 1}

with the smoothing factor \xi and the current loss E_n. The intuition here is that the smoothing factor adjusts how much a single loss value contributes to the RAE; unrolling the recursion shows every past loss entering with a geometrically decaying weight:

    RAE_n = \left(\frac{\xi}{\xi + 1}\right)^{n} RAE_0 + \frac{1}{\xi + 1} \sum_{k=1}^{n} \left(\frac{\xi}{\xi + 1}\right)^{n-k} E_k

By default the training runs for 30 epochs, but the loop is broken as soon as the RAE drops below a certain threshold (0.0001).
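
The RAE update and the early-stopping check together take only a few lines. A self-contained sketch (the smoothing factor value and the stand-in loss sequence are assumptions for illustration):

#include <cstdio>
#include <vector>

int main() {
    const double xi = 100.0;         // smoothing factor (assumed value)
    const double threshold = 0.0001; // early-stopping threshold
    double rae = 1.0;                // recent average error, initialized high

    // Stand-in for the per-sample RMS losses seen during training.
    std::vector<double> losses(250, 0.00005);

    bool converged = false;
    for (int epoch = 0; epoch < 30 && !converged; ++epoch) {
        for (double e : losses) {
            rae = (xi * rae + e) / (xi + 1.0);  // recursive RAE update
            if (rae < threshold) { converged = true; break; }
        }
    }
    std::printf("final RAE: %g, converged: %s\n", rae, converged ? "yes" : "no");
    return 0;
}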

Pipeline

Generating data

g++ -o data_gen data_gen_XOR.cpp
./data_gen

or

g++ -o data_gen data_gen_CSWAP.cpp
./data_gen

By default this creates 250 training samples and 100 test samples.
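
For orientation, a self-contained sketch of what such a generator could look like; the output file names and line format are illustrative, not read from data_gen_XOR.cpp:

#include <cstdlib>
#include <ctime>
#include <fstream>

// Write n random XOR samples, one input line and one target line each.
// The file format here is an assumption; the real generator may differ.
static void writeSamples(const char *path, int n) {
    std::ofstream out(path);
    for (int i = 0; i < n; ++i) {
        const int a = std::rand() % 2;
        const int b = std::rand() % 2;
        out << "in: "  << a << " " << b << "\n"
            << "out: " << (a ^ b) << "\n";
    }
}

int main() {
    std::srand(static_cast<unsigned>(std::time(nullptr)));
    writeSamples("training_data.txt", 250);  // default training set size
    writeSamples("test_data.txt", 100);      // default test set size
    return 0;
}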

Training

g++ -o training training.cpp
./training

The default settings are

  • learning rate (eta) = 0.08
  • momentum (alpha) = 0.1

Note: When training on the CSWAP data, the Neural Network will automatically be initialized with 3 input and 3 output neurons; however, 2 hidden neurons appear to be insufficient to train the network!
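
For context, eta and alpha enter the standard backpropagation weight update with momentum in the textbook form below (the exact notation used inside header/NeuralNetwork.h is an assumption):

    \Delta w_{ij}^{(n)} = \eta \, \delta_j \, o_i + \alpha \, \Delta w_{ij}^{(n-1)}

where o_i is the output of the sending neuron, \delta_j the backpropagated error gradient at the receiving neuron, and the \alpha-term re-applies a fraction of the previous update to damp oscillations.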

Plotting the error curve

The plotting script utilizes numpy and matplotlib:

pip install numpy
pip install matplotlib

then run

python plot.py 

to obtain a plot like this:

[Image: plot of the RAE over the training run]