This repository showcases two approaches to colorizing CIFAR10 images: an Auto Encoder U-Net and a Conditional GAN. Over the last decade, automatic colorization has been studied thoroughly due to its many applications, such as colorizing grayscale images and restoring aged and/or degraded photographs. The problem is highly ill-posed because of the extremely large number of degrees of freedom in assigning color information. In this approach, I attempt to generalize the procedure using a conditional Deep Convolutional Generative Adversarial Network (DCGAN), trained on the publicly available CIFAR10 dataset.
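The training pairs come from CIFAR10 itself: the color images serve as ground truth and their grayscale conversions serve as inputs. Below is a minimal sketch of how such pairs can be built with torchvision; the wrapper class and its parameters are illustrative assumptions, not the repository's exact preprocessing.

```python
# Sketch: build (grayscale input, color target) pairs from CIFAR10.
# Class name and normalization choices are assumptions for illustration.
import torch
from torchvision import datasets, transforms
import torchvision.transforms.functional as TF

class ColorizationCIFAR10(torch.utils.data.Dataset):
    """Wraps CIFAR10 so each item is (grayscale_input, color_target)."""
    def __init__(self, root="./data", train=True):
        self.base = datasets.CIFAR10(root=root, train=train, download=True,
                                     transform=transforms.ToTensor())  # RGB in [0, 1]

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        color, _ = self.base[idx]                 # ignore the class label
        gray = TF.rgb_to_grayscale(color)         # (1, 32, 32) input
        return gray, color                        # color is the (3, 32, 32) target

loader = torch.utils.data.DataLoader(ColorizationCIFAR10(),
                                     batch_size=64, shuffle=True)
```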
The GAN uses a U-Net-like fully convolutional architecture (Ronneberger et al., 2015), with concatenation of opposite layers, for the generator. The generator loss is L1-regularized, which forces the generator to produce results close to the ground truth at the pixel level. This should preserve the structure of the original images and prevent the generator from assigning arbitrary colors to pixels just to fool the discriminator. The generator takes the grayscale image as input, while the discriminator takes either the original image or the generated image together with the condition, which in this case is the grayscale image.
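The sketch below illustrates this objective in PyTorch: an adversarial term conditioned on the grayscale image plus a weighted L1 term toward the ground truth. The weight `lambda_l1` and the exact discriminator interface are assumptions, not the repository's settings.

```python
# Sketch of the L1-regularized conditional GAN objective described above.
# lambda_l1 and the discriminator signature are illustrative assumptions.
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()
lambda_l1 = 100.0  # assumed weight on the pixel-level L1 term

def generator_loss(discriminator, gray, fake_color, real_color):
    # Adversarial term: the generator wants D(condition, fake) judged as real.
    pred_fake = discriminator(torch.cat([gray, fake_color], dim=1))
    adv = bce(pred_fake, torch.ones_like(pred_fake))
    # L1 term: pull the generated image toward the ground truth per pixel.
    return adv + lambda_l1 * l1(fake_color, real_color)

def discriminator_loss(discriminator, gray, fake_color, real_color):
    # Real pairs (condition + ground truth) should be classified as real,
    # generated pairs as fake; the condition is the grayscale image.
    pred_real = discriminator(torch.cat([gray, real_color], dim=1))
    pred_fake = discriminator(torch.cat([gray, fake_color.detach()], dim=1))
    return (bce(pred_real, torch.ones_like(pred_real)) +
            bce(pred_fake, torch.zeros_like(pred_fake)))
```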
The U-Net architecture used for both the Auto Encoder and the generator can be represented with the graph below. The gray arrows represent the skip connections from each encoder layer to the mirroring decoder layer.
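As a complement to the graph, here is a minimal sketch of how such skip connections are wired: the upsampled decoder output is concatenated with the mirroring encoder output along the channel dimension. The depth and channel counts are illustrative assumptions and do not match the actual model.

```python
# Sketch of U-Net-style skip connections (the gray arrows): each decoder
# stage is concatenated with its mirroring encoder output. Layer sizes here
# are assumptions for illustration only.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU())
        # After concatenation the decoder feature map has 32 + 32 channels.
        self.out = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)                 # (32, 32, 32)
        e2 = self.enc2(e1)                # (64, 16, 16)
        d1 = self.dec1(e2)                # (32, 32, 32)
        d1 = torch.cat([d1, e1], dim=1)   # skip connection to the mirror layer
        return torch.sigmoid(self.out(d1))

y = TinyUNet()(torch.randn(8, 1, 32, 32))  # -> (8, 3, 32, 32)
```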
GAN-generated images showed a clear visual improvement over those generated by the U-Net alone. Their colors were more vibrant, whereas the U-Net results suffered from a light hue and a "sepia effect". In some cases the GAN was able to nearly replicate the ground truth, and it even colorized reasonably well an image whose ground truth was itself grayscale. Nonetheless, both models were trained for only 20 epochs, and the loss was still decreasing at the last epoch, so they would likely need more training time to converge.