
Using the UTK Faces Dataset

Primary language: Jupyter Notebook. License: MIT.

data-augmentation-with-gan-and-vae 💯

Vincent Fortin and I are using the UTK Faces dataset for our Machine Learning I project.

Unbalanced classes are one of the most frequent struggles when dealing with real data. Is it better to downsample, upsample, or do nothing at all? Another approach is to generate samples resembling the smallest class. In this project, we use Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) to generate samples of the smallest class. Using human faces, we will determine whether a convolutional neural network (CNN) trains better with or without the generated samples.

PROGRESS

  1. First, we trained a VAE to generate human faces.
  2. Then we trained a ConvNet with PyTorch, but it didn't work.
  3. So we tried Keras to see whether our architecture was the problem. It was not: we reached 90% accuracy.
  4. Here is the adversarial autoencoder.

TO DO

  • Train a Keras Model
  • Create a GAN to generate human faces
  • Explore other generative methods
  • Train CNNs to see if the accuracy is better with the generative methods
  • Fix the PyTorch CNN

PROJECT PLAN

  1. Create various sample generators
  2. Establish a benchmark CNN classifier, trained with 10% of the female samples (smaller class)
  3. Train classifiers on 10% of the female samples plus generated samples from each generator:
    • VAE
    • GAN
    • other
  4. Compare performance, plot
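
As a rough illustration of step 2 of the plan, the snippet below keeps every sample of the larger class and only 10% of the smaller class. It is a minimal sketch, not the code used in the notebooks: the helper `subsample_class`, the toy label list, and the encoding 0 = male, 1 = female are all assumptions made for the example.

```python
import random


def subsample_class(labels, target_label, fraction, seed=0):
    """Return sorted indices that keep every sample of the other classes
    but only `fraction` of the samples labelled `target_label`.

    Hypothetical helper for illustration; not part of the project code.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # seeded so the split is reproducible
    target_idx = [i for i, y in enumerate(labels) if y == target_label]
    other_idx = [i for i, y in enumerate(labels) if y != target_label]
    # Keep at least one target sample when the class is not empty.
    n_keep = max(1, round(len(target_idx) * fraction)) if target_idx else 0
    kept = rng.sample(target_idx, n_keep)
    return sorted(other_idx + kept)


# Toy labels (assumed encoding: 0 = male, 1 = female).
labels = [0] * 50 + [1] * 50
indices = subsample_class(labels, target_label=1, fraction=0.10)
n_female = sum(labels[i] for i in indices)
print(len(indices), n_female)
```

The returned indices can select the benchmark training set. For step 3, generated samples would be appended to that same subset, so that both classifiers see the identical real data.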

MISC

  • Try random erasing
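
Random erasing replaces a randomly placed rectangle of each training image with noise. The NumPy sketch below shows the idea under stated assumptions: images are float arrays of shape (H, W, C) with values in [0, 1], and the function name `random_erase` and the area and aspect-ratio ranges are illustrative choices, not values taken from this project.

```python
import numpy as np


def random_erase(image, rng, area_range=(0.02, 0.2),
                 aspect_range=(0.3, 3.3), max_tries=10):
    """Return a copy of `image` with one random rectangle filled with
    uniform noise in [0, 1). The input array is left untouched.

    Hypothetical helper; the default ranges are illustrative assumptions.
    """
    h, w = image.shape[:2]
    out = image.copy()
    for _ in range(max_tries):
        # Sample the rectangle's area (as a share of the image) and aspect ratio.
        area = rng.uniform(*area_range) * h * w
        aspect = rng.uniform(*aspect_range)
        eh = int(round(np.sqrt(area * aspect)))
        ew = int(round(np.sqrt(area / aspect)))
        if 0 < eh < h and 0 < ew < w:
            top = rng.integers(0, h - eh + 1)
            left = rng.integers(0, w - ew + 1)
            noise = rng.uniform(0.0, 1.0, size=(eh, ew) + image.shape[2:])
            out[top:top + eh, left:left + ew] = noise
            return out
    # No rectangle fit within max_tries: return the unmodified copy.
    return out


# Usage on a dummy 64x64 RGB image filled with -1.0, so that erased
# pixels (which land in [0, 1)) are easy to spot.
rng = np.random.default_rng(0)
image = np.full((64, 64, 3), -1.0)
erased = random_erase(image, rng)
changed_fraction = float((erased != -1.0).mean())
print(erased.shape, changed_fraction)
```

In a training loop this would typically be applied per image with some probability, only on the training split.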

Example of the Adversarial Autoencoder Learning


This is the output (generated faces) of the adversarial autoencoder.