This repository contains a simple neural network that solves the XOR problem. The network was created using Keras and TensorFlow. Furthermore, a simple program is provided that allows the user to test the trained model. The `environment.yml` makes sure that the user can recreate the conda environment in which this example was built. In addition to the Keras model, a TensorFlow implementation of the same network can be found in the notebook `2_tensorflow_implementation` in the `notebooks/` folder.
- Create a virtual environment by using one of the following options:
  - **Conda environment**: Use the `Makefile` and run `make xx_environment` to recreate the environment this example was built in, where `xx` stands either for `osx_` or `linux_` due to some build issues (see notes).
  - **Pipenv**: A `Pipfile` and a `Pipfile.lock` are provided if you prefer to use `pipenv`. Run `$ pipenv shell` in the project directory, then execute `$ pipenv sync` to install all missing packages.
  - **requirements.txt**: If you prefer, a plain `requirements.txt` is provided to create a virtual environment using `virtualenv`.
- Run `make train` to train a simple neural network created in Keras, consisting of one hidden dense layer with 16 neurons. The model is saved in the directory `models/`.
- Run `make test` to load the trained model. A command prompt should appear, showing `input>`.
- Enter a command of the shape `x, y`, where `x` and `y` map to two input sources that are either `0` or `1`, e.g. `input> 1, 0`.
- An output prompt should appear, showing either `0` or `1` depending on the input, e.g. `output> 0.0`.
- Enter `input> exit` to quit the program.
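The test prompt described above could be sketched roughly as follows. This is an illustrative sketch, not the actual script in this repository; the model file name `xor_model.h5` is an assumption, only the `models/` directory is taken from the text.

```python
import numpy as np

def parse_command(line):
    """Parse a prompt line of the shape 'x, y' into a (1, 2) float array."""
    x, y = (float(v) for v in line.split(","))
    return np.array([[x, y]], dtype=np.float32)

def run_prompt(model):
    """Read commands from stdin and print the rounded model prediction."""
    while True:
        line = input("input> ").strip()
        if line == "exit":
            break
        pred = model.predict(parse_command(line), verbose=0)
        # Round the sigmoid output so the prompt shows 0.0 or 1.0.
        print(f"output> {np.round(pred[0, 0])}")

if __name__ == "__main__":
    from tensorflow import keras
    # Hypothetical file name; the actual file in models/ may differ.
    run_prompt(keras.models.load_model("models/xor_model.h5"))
```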
Which network architecture was used?
I used a simple feed-forward neural network with one hidden dense layer using a rectified linear unit (ReLU) activation. The hidden layer contains 16 neurons. The final output layer is also a dense layer with one neuron, activated using a sigmoid function, which maps the output of the hidden layer to a value between `0` and `1`, indicating a probability. The input of the network is a numpy array containing four samples (every way to combine the numbers `0` and `1`), where each sample is of dimension `2`. So the shape of the input is `(4, 2)`.
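The architecture described above could be sketched in Keras roughly like this. The layer sizes and activations come from the text; the optimizer and epoch count are assumptions, since they are not stated here.

```python
import numpy as np
from tensorflow import keras

# All four XOR input combinations, shape (4, 2), and their labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y = np.array([0, 1, 1, 0], dtype=np.float32)

# One hidden dense layer with 16 ReLU neurons, one sigmoid output neuron.
model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Optimizer and epoch count are assumptions, not taken from the repository.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=500, verbose=0)

# Sigmoid outputs near 0 or 1; rounding gives the predicted class.
print(np.round(model.predict(X, verbose=0)).flatten())
```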
The minimum number of layers required
The short answer to this question is that one input layer, one hidden layer, and one output layer are required to get valid results.
However, in the `notebooks/` folder, the notebook `1_xor_prototype_keras.ipynb` shows several experiments with different model architectures. More complex architectures, such as an increased number of hidden layers and/or an increased number of neurons per layer, can certainly improve the performance of the model in that fewer epochs are required to reach a higher accuracy.
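Experiments like those in the notebook can be sketched as a small loop over architectures. The configurations below are illustrative; the exact ones tried in `1_xor_prototype_keras.ipynb` are not listed here.

```python
import numpy as np
from tensorflow import keras

# All four XOR inputs and their labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y = np.array([0, 1, 1, 0], dtype=np.float32)

# Illustrative configurations: (hidden layers, neurons per layer).
results = []
for n_layers, n_neurons in [(1, 2), (1, 16), (2, 16)]:
    layers = [keras.layers.Dense(n_neurons, activation="relu") for _ in range(n_layers)]
    layers.append(keras.layers.Dense(1, activation="sigmoid"))
    model = keras.Sequential(layers)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    history = model.fit(X, y, epochs=200, verbose=0)
    results.append((n_layers, n_neurons, history.history["accuracy"][-1]))

for n_layers, n_neurons, acc in results:
    print(f"{n_layers} hidden layer(s) x {n_neurons} neurons -> accuracy {acc:.2f}")
```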
- This example was created under Linux (Ubuntu 18.04 LTS) and tested on a Mac. Due to some conda issues, two separate `environment.yml` files were created. If you use Linux, use the `linux_environment.yml`, and `osx_environment.yml` if you use macOS.
- This example uses Python 3.7.