Image-Dehazing-using-CycleGAN

OBJECTIVE:

The main objective of this project is to dehaze hazed images using CycleGAN, trained on unpaired hazed and dehazed images with only a limited amount of data.

DATASET:

https://drive.google.com/drive/folders/1ytLc1atUNpvauXHHXCQrN9vpw_dsGSow?usp=sharing

This dataset contains 51 hazed and 56 dehazed images.

GAN OVERVIEW:

[Figure: GAN architecture overview]

x is a sample drawn from the real data, z is a random noise vector, and G(z) is the generator's output for that noise vector. The discriminator's output can be written in two ways depending on the type of input it receives: D(x) when the input is real data, and D(G(z)) when the input is the output of the generator.

The discriminator tries to maximize the probability that its predictions are correct; in other words, it tries to push D(G(z)) toward 0. The generator, in contrast, tries to minimize the probability that the discriminator is correct; in other words, it tries to push D(G(z)) toward 1.
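This adversarial game is the standard GAN minimax objective (Goodfellow et al., 2014), which the discriminator maximizes and the generator minimizes:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$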


Applications of GAN:

GANs have a wide range of real-world use cases:

  • Generate new data samples that resemble the available data without duplicating any real sample.
  • Generate realistic pictures of people who have never existed (deepfakes).
  • Generate not only images but also text, poems, music, and songs.
  • Generate images from text descriptions (e.g., ObjGAN, an object-driven attentive GAN).
  • Generate anime characters for game development and animation production.
  • Translate one image into another without disturbing the background. (In this project, we translate hazed images into dehazed images.)
  • Increase the resolution of an image or video (SRGAN and ESRGAN).

CycleGAN:

CycleGAN is an extension of the GAN architecture that involves the simultaneous training of two generator models and two discriminator models. One generator takes images from the first domain (hazed images) as input and outputs images for the second domain (dehazed images), and the other generator takes images from the second domain (dehazed images) as input and generates images for the first domain (hazed images). Discriminator models are then used to judge how plausible the generated images are for their target domain, and the generator models are updated accordingly.

Usually, a GAN requires a dataset of paired examples to train an image-to-image translation model. Pairwise datasets are difficult to obtain in the real world, and in many cases such a dataset does not exist. CycleGAN performs unpaired image-to-image translation successfully and is well known for exactly this setting.

[Figure: CycleGAN architecture]


In this project we have two different collections of data (hazed images and dehazed images). For this we build an architecture of two GANs, each with a generator and a discriminator model, so there are four models in total (Generator A2B, Generator B2A, Discriminator A, and Discriminator B). To summarize,

  • GAN 1: Translates dehazed Images to hazed Images.

  • GAN 2: Translates hazed Images to dehazed Images.

We can summarize the generator and discriminator models from GAN 1 as follows:

Generator Model 1:

Input: dehazed Images.

Output: Generated hazed Images.

Discriminator Model 1:

Input: hazed Images and output from Generator Model 1.

Output: Likelihood that the image is a hazed image.

Similarly, we can summarize the generator and discriminator models from GAN 2 as follows:

Generator Model 2:

Input: hazed Images.

Output: Generated dehazed Images.

Discriminator Model 2:

Input: dehazed Images and output from Generator Model 2.

Output: Likelihood that the image is a dehazed image.


Implementation of CycleGAN for single image dehazing:

The data is preprocessed so that all images have the same shape, i.e., (256, 256, 3).
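As a minimal sketch, the preprocessing can look like the following, assuming a TensorFlow/Keras implementation; the function name and the [-1, 1] normalization (matching a tanh generator output) are illustrative choices, not necessarily this project's exact pipeline.

```python
import tensorflow as tf

def load_and_preprocess(path):
    # Read an image file, decode it to RGB, resize to 256x256x3,
    # and scale pixel values to [-1, 1].
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [256, 256])
    image = (tf.cast(image, tf.float32) / 127.5) - 1.0
    return image
```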

Input Image:

[Figure: sample input image]

Generator:

[Figure: generator architecture]
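For reference, here is a minimal sketch of the ResNet-style generator used in the original CycleGAN paper, written in TensorFlow/Keras; the layer sizes and number of residual blocks are illustrative and may differ from this project's exact model (GroupNormalization with groups=-1, available in Keras >= 2.11, stands in for instance normalization).

```python
import tensorflow as tf
from tensorflow.keras import layers

def resnet_block(x, filters=256):
    # Residual block: two 3x3 convolutions plus a skip connection.
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.GroupNormalization(groups=-1)(y)  # acts as instance norm
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.GroupNormalization(groups=-1)(y)
    return layers.Add()([x, y])

def build_generator():
    inp = layers.Input(shape=(256, 256, 3))
    # Downsample 256x256 -> 64x64 while widening the channels.
    x = layers.Conv2D(64, 7, padding="same", activation="relu")(inp)
    x = layers.Conv2D(128, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2D(256, 3, strides=2, padding="same", activation="relu")(x)
    # Transform with residual blocks.
    for _ in range(6):
        x = resnet_block(x)
    # Upsample back to 256x256 and map to 3 channels in [-1, 1].
    x = layers.Conv2DTranspose(128, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv2D(3, 7, padding="same", activation="tanh")(x)
    return tf.keras.Model(inp, out, name="generator")
```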

Discriminator:

[Figure: discriminator architecture]
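Similarly, a minimal sketch of the 70x70 PatchGAN discriminator that CycleGAN conventionally uses; again, the exact filter counts here are illustrative.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_discriminator():
    inp = layers.Input(shape=(256, 256, 3))
    x = layers.Conv2D(64, 4, strides=2, padding="same")(inp)
    x = layers.LeakyReLU(0.2)(x)
    for filters in (128, 256):
        x = layers.Conv2D(filters, 4, strides=2, padding="same")(x)
        x = layers.GroupNormalization(groups=-1)(x)  # acts as instance norm
        x = layers.LeakyReLU(0.2)(x)
    x = layers.Conv2D(512, 4, padding="same")(x)
    x = layers.GroupNormalization(groups=-1)(x)
    x = layers.LeakyReLU(0.2)(x)
    # One real/fake score per image patch rather than per whole image.
    out = layers.Conv2D(1, 4, padding="same")(x)
    return tf.keras.Model(inp, out, name="discriminator")
```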

The parameters of the model are updated using four losses (a combined sketch follows the list):

  • Adversarial loss (L2 or mean squared error)

  • Identity loss (L1 or mean absolute error)

  • Forward cycle loss (L1 or mean absolute error)

  • Backward cycle loss (L1 or mean absolute error)
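Here is a sketch of how these terms combine for one generator direction, assuming the least-squares adversarial loss listed above; the weights (lambda_cycle = 10, identity weighted at 0.5 * lambda_cycle) follow the original CycleGAN paper and may differ from this project's settings.

```python
import tensorflow as tf

mse = tf.keras.losses.MeanSquaredError()   # adversarial loss (L2)
mae = tf.keras.losses.MeanAbsoluteError()  # identity and cycle losses (L1)

def generator_loss(disc_fake, real, cycled, same, lambda_cycle=10.0):
    # disc_fake: discriminator scores for generated images.
    # real: input image; cycled: image after a full A->B->A (or B->A->B) cycle.
    # same: generator's output when fed an image already in its target domain.
    adversarial = mse(tf.ones_like(disc_fake), disc_fake)  # fool the discriminator
    cycle = mae(real, cycled)      # forward or backward cycle-consistency loss
    identity = mae(real, same)     # identity mapping loss
    return adversarial + lambda_cycle * cycle + 0.5 * lambda_cycle * identity
```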


Evaluation metrics:

A haze removal algorithm's performance can be evaluated on several factors; among them, two of the most frequently used are PSNR and SSIM. Peak Signal-to-Noise Ratio (PSNR) measures the ability of the algorithm to remove noise from a noisy image.

SSIM is also a full-reference evaluation method; it measures the similarity between the clean image and the image under evaluation in terms of brightness, contrast, and structure. Two identical images have an SSIM of 1.
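As a sketch, both metrics can be computed with scikit-image (the metric functions below are the real skimage.metrics API; channel_axis requires scikit-image >= 0.19):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(clean, dehazed):
    # clean, dehazed: uint8 RGB arrays of identical shape, e.g. (256, 256, 3).
    psnr = peak_signal_noise_ratio(clean, dehazed)
    ssim = structural_similarity(clean, dehazed, channel_axis=-1)
    return psnr, ssim
```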


OUTPUT:

[Figure: hazed inputs and dehazed outputs]


Conclusion:

The trained GAN was tested on a test dataset of 10 images. The average Peak Signal-to-Noise Ratio is 49.395 and the average Structural Similarity Index Measure is 0.8916. The model dehazes images well, removing the haze (noise) without any loss of spatial properties, despite being trained on just 107 samples.