For this project, I use a DCGAN for generating anime faces, trained on the anime faces dataset (https://www.kaggle.com/datasets/splcher/animefacedataset). I also test the DCGAN architecture on generating handwritten digits from the MNIST dataset.
The model architecture follows the one used in the DCGAN paper (https://arxiv.org/pdf/1511.06434.pdf):
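For reference, the generator in the DCGAN paper (its Figure 1, for 64×64 output) projects the 100-dimensional latent vector to a 4×4×1024 tensor and then upsamples with fractionally-strided convolutions. A compact sketch of the layer shapes (the layer names below are informal descriptions, not framework API calls):

```python
# (layer description, output channels, spatial size after the layer)
GENERATOR_LAYERS = [
    ("project and reshape z (dim 100)",     1024, 4),   # -> 4x4x1024
    ("fractionally-strided conv 4x4",        512, 8),   # -> 8x8x512
    ("fractionally-strided conv 4x4",        256, 16),  # -> 16x16x256
    ("fractionally-strided conv 4x4",        128, 32),  # -> 32x32x128
    ("fractionally-strided conv 4x4, tanh",    3, 64),  # -> 64x64x3
]
```

The discriminator mirrors this with strided convolutions in the opposite direction, ending in a single sigmoid output.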
The loss function for GAN, also known as the min-max loss:
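In the notation of the original GAN paper, this objective is:

```latex
\min_G \max_D \; V(D, G) =
\mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator $D$ tries to maximize $V$, while the generator $G$ tries to minimize it.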
The generator
The generator learns to approximate the distribution of the real training data and then samples from this learned distribution. During training, the generator and discriminator are trained alternately: one is updated while the other is held fixed.
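To make the alternating scheme concrete, here is a deliberately tiny 1-D sketch. Everything in it is invented for illustration (a linear generator, a logistic discriminator, hand-derived gradients); it only shows the structure of one discriminator update followed by one generator update per step:

```python
import math
import random

random.seed(0)

def sigmoid(t):
    # clamp the logit for numerical safety before exponentiating
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, t))))

# Toy setup: real data ~ N(3, 1); generator g(z) = a*z + b with z ~ N(0, 1);
# discriminator D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0   # generator parameters
w, c = 0.1, 0.0   # discriminator parameters
lr = 0.05

for step in range(2000):
    x = random.gauss(3.0, 1.0)   # one real sample
    z = random.gauss(0.0, 1.0)   # latent noise
    gz = a * z + b               # generated sample

    # --- discriminator step (G fixed): ascend log D(x) + log(1 - D(G(z))) ---
    dr, df = sigmoid(w * x + c), sigmoid(w * gz + c)
    w += lr * ((1 - dr) * x - df * gz)
    c += lr * ((1 - dr) - df)

    # --- generator step (D fixed): descend log(1 - D(G(z))) ---
    df = sigmoid(w * (a * z + b) + c)
    a += lr * df * w * z
    b += lr * df * w
```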
The generator takes as input a random noise vector z, sampled from a standard normal distribution (the latent vector), and maps it to an image.
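Sampling this latent input is a one-liner; the dimensionality of 100 below is the value used in the DCGAN paper, and the rest is a trivial sketch:

```python
import random

random.seed(0)
LATENT_DIM = 100  # size of z used in the DCGAN paper

# one latent vector z drawn from a standard normal distribution
z = [random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
```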
The discriminator evaluates the samples it is given and outputs a value between 0 and 1. A value close to 0 means the sample is predicted to be fake (generated), and a value close to 1 means it is predicted to be real.
The discriminator is essentially a binary classifier, and its loss function is the binary cross-entropy loss.
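For a single prediction, this loss can be written as:

```latex
L(y, \hat{y}) = -\big[\, y \log \hat{y} + (1 - y)\log(1 - \hat{y}) \,\big]
```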
where $y \in \{0, 1\}$ is the ground-truth label (1 for real, 0 for fake) and $\hat{y}$ is the discriminator's predicted probability that the input is real.
When the label is $y = 1$, the input is a real sample $x$ from the training data, and the loss reduces to $-\log D(x)$.
When the label is $y = 0$, the input is a generated sample $G(z)$, and the loss reduces to $-\log(1 - D(G(z)))$.
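A quick numerical check of the two branches; the discriminator outputs 0.9 and 0.2 below are made up for illustration:

```python
import math

def bce(y, y_hat):
    # binary cross-entropy for a single prediction y_hat with label y
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

d_real = 0.9  # hypothetical discriminator output D(x) for a real image
d_fake = 0.2  # hypothetical output D(G(z)) for a generated image

loss_real = bce(1, d_real)  # reduces to -log D(x)
loss_fake = bce(0, d_fake)  # reduces to -log(1 - D(G(z)))
```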
The generator tries to fool the discriminator into classifying its output as real, so its loss function is $\log(1 - D(G(z)))$, which the generator tries to minimize (equivalently, it pushes $D(G(z))$ toward 1).
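A small numerical aside, which is a well-known observation from the GAN literature rather than something specific to this project: when the discriminator confidently rejects a generated sample, the gradient of $\log(1 - D(G(z)))$ with respect to the discriminator's logit nearly vanishes, which is why many implementations instead maximize $\log D(G(z))$:

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Early in training, D confidently rejects fakes: logit t = -4
# gives D(G(z)) = sigmoid(-4) ~ 0.018.
t = -4.0
d = sigmoid(t)

# d/dt log(1 - sigmoid(t)) = -sigmoid(t): near zero when d is small (saturates)
grad_saturating = -d
# d/dt log(sigmoid(t)) = 1 - sigmoid(t): near one when d is small (strong signal)
grad_non_saturating = 1.0 - d
```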