This repository explores CIFAR-10 image colorization with Convolutional Neural Networks (CNNs) and enhances the results by applying skip connections and residual blocks.

CIFAR-10-Image-Colorization

Explore the vibrant world of CIFAR-10 image colorization with deep learning.

Image colorization is the process of adding color to grayscale images, a fascinating task that can be effectively addressed using deep learning techniques. This project focuses on developing and training deep learning models to automatically colorize grayscale images, leveraging the renowned CIFAR-10 dataset.

Introduction

In this project, we implement and train convolutional neural network (CNN) models for image colorization. The main notebook also contains a U-Net model that uses skip connections; to further improve colorization accuracy, residual blocks were added to the U-Net model. Here's an overview of the key components and features of the project:

  • model.py: This module contains the implementation of various CNN models for image colorization. You can choose from different model architectures, including a basic CNN, a U-Net, and a ResNet. These models are designed to predict color categories for grayscale input images.

  • torch_helper.py: This helper module provides a set of functions for data processing, data loading, and training. It simplifies tasks like generating data batches, converting data to PyTorch tensors, calculating loss, and running the validation step during training.

  • train.py: The primary training script allows you to train image colorization models. It utilizes the models defined in model.py, handles the CIFAR-10 dataset, and executes the training loop. You can customize various training parameters to suit your specific needs.

  • utils.py: A collection of utility functions used to load and preprocess CIFAR-10 data, as well as to visualize results. These utilities assist with data loading, data preprocessing, and visualizing image colorization outputs.
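Because the models are described as predicting a color category for each pixel, the loss step can be read as per-pixel classification. The sketch below illustrates that idea and is not the repository's code: it assumes cross-entropy as the loss, and the number of color categories (`num_colours = 24`) and the random stand-in tensors are placeholders.

```python
import torch
import torch.nn as nn

num_colours = 24                    # hypothetical number of colour categories
batch, height, width = 4, 32, 32    # CIFAR-10 images are 32x32

# Stand-in for a model's output: one score per colour category per pixel.
logits = torch.randn(batch, num_colours, height, width)
# Stand-in for the labels: one colour-category index per pixel.
targets = torch.randint(0, num_colours, (batch, height, width))

# CrossEntropyLoss accepts (N, C, H, W) logits with (N, H, W) integer targets
# and averages the classification loss over every pixel in the batch.
criterion = nn.CrossEntropyLoss()
loss = criterion(logits, targets)

# The predicted colour category of each pixel is the highest-scoring class.
predicted = logits.argmax(dim=1)
```

Mapping `predicted` back through the color palette would then give the colorized image.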

Dataset

For this task, we utilized the CIFAR-10 dataset, specifically the "automobile" class. The CIFAR-10 dataset contains 60,000 color images with dimensions of 32x32 across 10 classes. The dataset is split into 50,000 training images and 10,000 test images. We convert the original RGB images to grayscale images for input and create labels corresponding to the color of each pixel within each input image.
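The preparation step described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the code in utils.py: the four-color palette, the luminance weights, and the function names `rgb_to_gray` and `rgb_to_labels` are all hypothetical, and random arrays stand in for CIFAR-10 images.

```python
import numpy as np

def rgb_to_gray(images: np.ndarray) -> np.ndarray:
    """Convert (N, H, W, 3) RGB images in [0, 1] to (N, H, W, 1) grayscale."""
    # Assumed luminance weights (ITU-R BT.601); a plain channel mean also works.
    weights = np.array([0.299, 0.587, 0.114])
    return (images @ weights)[..., np.newaxis]

def rgb_to_labels(images: np.ndarray, palette: np.ndarray) -> np.ndarray:
    """Label each pixel with the index of its nearest palette colour."""
    # Broadcast (N, H, W, 1, 3) against (K, 3) to get (N, H, W, K, 3).
    diff = images[..., np.newaxis, :] - palette
    dist = (diff ** 2).sum(axis=-1)   # squared Euclidean distance per colour
    return dist.argmin(axis=-1)       # (N, H, W) integer labels

# Hypothetical palette: black, red, blue, white.
palette = np.array([[0, 0, 0], [1, 0, 0], [0, 0, 1], [1, 1, 1]], dtype=np.float64)

rng = np.random.default_rng(0)
images = rng.random((2, 32, 32, 3))       # stand-in for two CIFAR-10 images
gray = rgb_to_gray(images)                # model input
labels = rgb_to_labels(images, palette)   # per-pixel colour categories
```

A real palette would typically contain more colors than this one; the small palette here only keeps the example readable.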

Getting Started

To get started with this project, you'll need Python and PyTorch installed on your system. We recommend using Anaconda or virtual environments to manage your Python environment. Ensure you have a GPU available for training, as the tasks may require significant computational resources. If you're using Colab, remember to enable GPU support.
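A quick way to confirm that PyTorch can see a GPU, whether locally or in Colab, is sketched below. The variable names are illustrative, and the code falls back to the CPU when no GPU is available.

```python
import torch

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors (and models) must be placed on the chosen device before training.
x = torch.zeros(1, 1, 32, 32, device=device)
print(f"Running on: {device}")
```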

Model Training

Several models were implemented, each trained with the parameters below:

  • Number of Filters (NF)
  • Learning Rate (LR)
  • Kernel Size
  • Number of Epochs
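One way to keep these parameters together is a small configuration object, sketched below. The class name `TrainConfig`, its field names, and every default value are placeholders chosen for illustration; they are not the settings used in this project.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the training parameters listed above."""
    model: str = "CNN"            # which architecture to train
    num_filters: int = 32         # NF (placeholder value)
    learning_rate: float = 1e-3   # LR (placeholder value)
    kernel_size: int = 3          # placeholder value
    epochs: int = 25              # placeholder value

# Each experiment overrides only the parameters it changes.
base = TrainConfig()
unet = TrainConfig(model="UNet", num_filters=64)
```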

Base Model (CNN)

The base model is initially trained to address the image colorization task. The network architecture is as follows:

Figure: Base Model Architecture

Custom U-Net Model

We introduce skip connections to improve the model's performance. The architecture is based on U-Net and includes skip connections.

Figure: Custom U-Net Architecture
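The skip-connection idea can be sketched in PyTorch as follows. This is a reduced illustration, not the architecture shown in the figure: the class name `TinyUNet`, the layer counts, the filter widths, and the number of color categories are all assumptions. Each decoder stage concatenates the upsampled features with the encoder output of the same resolution.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Hypothetical two-level U-Net for 32x32 grayscale inputs."""

    def __init__(self, num_filters=8, num_colours=24, kernel_size=3):
        super().__init__()
        nf, pad = num_filters, kernel_size // 2

        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size, padding=pad),
                nn.BatchNorm2d(c_out),
                nn.ReLU(),
            )

        self.down1 = block(1, nf)
        self.down2 = block(nf, 2 * nf)
        self.bottleneck = block(2 * nf, 2 * nf)
        self.up1 = block(4 * nf, nf)   # input: bottleneck (2nf) + down2 skip (2nf)
        self.up2 = block(2 * nf, nf)   # input: up1 (nf) + down1 skip (nf)
        self.head = nn.Conv2d(nf + 1, num_colours, kernel_size, padding=pad)
        self.pool = nn.MaxPool2d(2)
        self.upsample = nn.Upsample(scale_factor=2)

    def forward(self, x):
        d1 = self.down1(x)                    # (N, nf, 32, 32)
        d2 = self.down2(self.pool(d1))        # (N, 2nf, 16, 16)
        b = self.bottleneck(self.pool(d2))    # (N, 2nf, 8, 8)
        # Skip connections: concatenate encoder features along the channel axis.
        u1 = self.up1(torch.cat([self.upsample(b), d2], dim=1))    # (N, nf, 16, 16)
        u2 = self.up2(torch.cat([self.upsample(u1), d1], dim=1))   # (N, nf, 32, 32)
        # The grayscale input itself is also passed to the output layer.
        return self.head(torch.cat([u2, x], dim=1))

out = TinyUNet()(torch.randn(2, 1, 32, 32))   # per-pixel colour-category scores
```

The skip connections give the decoder access to fine spatial detail that pooling discards in the encoder, which is the usual motivation for using a U-Net here.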

U-Net with Residual Block

As an optional extension (an extra-credit task), residual blocks are added to the DownConv, UpConv, and Bottleneck layers of the U-Net.

Figure: Residual Block
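A residual block adds its input back onto the output of its convolutions. The sketch below shows one common form and may differ from the block in the figure: the class name `ResidualBlock`, the two-convolution body, and the 1x1 shortcut used when channel counts differ are assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Hypothetical residual block: output = ReLU(F(x) + shortcut(x))."""

    def __init__(self, c_in, c_out, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.body = nn.Sequential(
            nn.Conv2d(c_in, c_out, kernel_size, padding=pad),
            nn.BatchNorm2d(c_out),
            nn.ReLU(),
            nn.Conv2d(c_out, c_out, kernel_size, padding=pad),
            nn.BatchNorm2d(c_out),
        )
        # When the channel count changes, a 1x1 convolution matches shapes
        # so that the addition in forward() is well defined.
        self.shortcut = nn.Identity() if c_in == c_out else nn.Conv2d(c_in, c_out, 1)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.body(x) + self.shortcut(x))

block = ResidualBlock(8, 16)
y = block(torch.randn(2, 8, 32, 32))   # spatial size is preserved
```

A block like this could stand in for the plain convolution blocks in the DownConv, UpConv, and Bottleneck stages, since it keeps the same input and output shapes.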

Results and Visualization

Figure: Colorized figures

Figure: Loss during training of the CNN model

Figure: Loss during training of the U-Net model

Figure: Loss during training of the ResNet model

References

  • Olaf Ronneberger, Philipp Fischer, Thomas Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015.
  • Stanford University Convolutional Neural Networks Tutorial