This repository contains all projects I finished as part of Udacity's deep learning nanodegree.
In my first project I built a simple neural network from scratch in raw numpy to estimate the optimal price per unit for a bike store, based on the bikes' prices and the number of customers.
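A minimal sketch of that idea, assuming a single hidden layer with a sigmoid activation and plain gradient descent (the shapes, learning rate and variable names here are illustrative, not the ones used in the actual project):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Illustrative sizes: 2 input features, 8 hidden units, 1 regression output.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(2, 8))
W2 = rng.normal(scale=0.1, size=(8, 1))
lr = 0.01

def train_step(X, y):
    """One forward/backward pass of vanilla gradient descent."""
    global W1, W2
    hidden = sigmoid(X @ W1)          # forward pass through the hidden layer
    output = hidden @ W2              # linear output for regression
    error = output - y                # prediction error
    # Backpropagate the error through both layers
    # (constant factors are folded into the learning rate).
    grad_W2 = hidden.T @ error / len(X)
    grad_hidden = error @ W2.T * hidden * (1 - hidden)
    grad_W1 = X.T @ grad_hidden / len(X)
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2
    return np.mean(error ** 2)
```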
Next I implemented an image classifier as a convolutional neural network inspired by AlexNet's architecture, trained on the CIFAR-10 dataset, where it reached about 70% accuracy.
It used a classical architecture: two convolutional layers with dropout, followed by two fully connected layers.
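A rough sketch of that kind of network in PyTorch (the framework, channel counts and dropout rate are illustrative assumptions, not the exact configuration of the original project):

```python
import torch.nn as nn

class SmallCifarNet(nn.Module):
    """Two conv blocks with dropout, then two fully connected layers
    (CIFAR-10: 32x32x3 inputs, 10 classes)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                     # 32x32 -> 16x16
            nn.Dropout(0.25),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                     # 16x16 -> 8x8
            nn.Dropout(0.25),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 256), nn.ReLU(),
            nn.Linear(256, 10),                  # one logit per CIFAR-10 class
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```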
Around 2011, a good classification error rate on the ImageNet benchmark was about 25%. In 2012, Alex Krizhevsky's convolutional neural network achieved 16%; over the next couple of years, error rates fell to a few percent. By 2015, researchers were reporting software that exceeded human ability at narrow visual tasks.
Diving deeper into recurrent neural networks, I designed a sequence-to-sequence model with encoders and decoders implemented as LSTM cells, plus lookup tables and embeddings to clean the data, filter out noise and boost performance.
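The lookup-table and embedding step might look roughly like this (a hedged sketch: the token handling and embedding size are assumptions for illustration, not the project's actual preprocessing code):

```python
import torch
import torch.nn as nn

def build_lookup_tables(tokens):
    """Map each unique token to an integer id and back."""
    vocab = sorted(set(tokens))
    word_to_int = {word: i for i, word in enumerate(vocab)}
    int_to_word = {i: word for word, i in word_to_int.items()}
    return word_to_int, int_to_word

tokens = "the cat sat on the mat".split()
word_to_int, int_to_word = build_lookup_tables(tokens)

# An embedding layer turns those ids into dense vectors the LSTM can consume.
embedding = nn.Embedding(num_embeddings=len(word_to_int), embedding_dim=16)
ids = torch.tensor([word_to_int[w] for w in tokens])
vectors = embedding(ids)              # shape: (sequence_length, 16)
```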
LSTM stands for long short-term memory; an LSTM cell allows the model to keep track of previously processed data. The cell combines the current input with a previously stored value, and sigmoid functions act as gates: each gate's output multiplies the value it guards, so when a gate outputs zero the corresponding value is effectively ignored rather than carried forward.
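As an illustration of that gating idea, here is a minimal single LSTM step written in numpy (the dictionary-of-weights layout is only for readability; it is a sketch of the standard equations, not the project's code):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b each hold parameters for the
    forget (f), input (i), output (o) and candidate (g) paths."""
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # how much old memory to keep
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # how much new input to accept
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # how much memory to expose
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])   # candidate new information
    c = f * c_prev + i * g      # a gate at zero multiplies its term away
    h = o * np.tanh(c)          # hidden state passed to the next step
    return h, c
```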
It passes information along for as long as it stays relevant, helped by an attention layer that weights the words in the input, giving the least relevant ones little influence, in order to keep track of context. The generated scripts are still nonsensical; with a few more days of training and more data the results would have been far more impressive, but adjusting the hyperparameters was enough for the model to show a smooth drop in the loss function. RNNs usually run best with around three layers, and adding more tends to produce a plateau effect, according to Ian Goodfellow's papers; CNNs, on the contrary, leave room for as many layers as are necessary.
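For reference, the core of such an attention layer can be written in a few lines (a generic dot-product attention sketch, not the exact mechanism used in the project):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dot_product_attention(decoder_state, encoder_outputs):
    """Weight each encoder output by its relevance to the current decoder state."""
    scores = encoder_outputs @ decoder_state    # one relevance score per source position
    weights = softmax(scores)                   # irrelevant positions get weights near zero
    context = weights @ encoder_outputs         # weighted sum fed to the decoder
    return context, weights
```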
By far my favourite project: a sequence-to-sequence model trained to translate from English to French. The accuracy went well beyond 90%, giving really impressive results, and more data would have made it even easier to improve the model.
Generative adversarial networks, or GANs, consist of two convolutional neural networks, a generator and a discriminator, which together form a probabilistic model that produces machine-generated results.
The idea underlying this concept is that the generator creates data while the discriminator tries to classify it as fake or real; as both get better at their tasks, the generated outputs become more convincing. A sigmoid cross-entropy function is used for both losses. In my project I designed a GAN trained on pictures of celebrities that eventually learnt to generate faces.
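Those two losses might look like this in PyTorch (a hedged sketch assuming binary cross-entropy on raw logits; the label conventions and any smoothing used in the actual project may differ):

```python
import torch
import torch.nn.functional as F

def discriminator_loss(real_logits, fake_logits):
    """Push the discriminator to call real images 1 and generated images 0."""
    real_loss = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
    fake_loss = F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    return real_loss + fake_loss

def generator_loss(fake_logits):
    """Push the generator to make the discriminator call its images 1."""
    return F.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits))
```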
The generator follows an architecture similar to a CNN run in reverse: its convolutions are ordered by increasing spatial size, growing a small noise vector step by step into a full image.
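A rough sketch of that upsampling generator in PyTorch (DCGAN-style; the latent size, channel counts and output resolution are illustrative assumptions):

```python
import torch.nn as nn

class Generator(nn.Module):
    """Grows a 100-dim noise vector into a 32x32 RGB image via transposed convolutions."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, kernel_size=4, stride=1, padding=0),  # 1x1 -> 4x4
            nn.BatchNorm2d(256), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),    # 4x4 -> 8x8
            nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),     # 8x8 -> 16x16
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),       # 16x16 -> 32x32
            nn.Tanh(),                                                           # pixel values in [-1, 1]
        )

    def forward(self, z):
        # z has shape (batch, z_dim, 1, 1)
        return self.net(z)
```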