Semantic-Segmentation

Using UNet Model and Jupyter Notebook

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Image: 1e6f48393e17_03
Mask: 1e6f48393e17_03_mask

Installation

  1. Create a conda environment:
conda create --name env-name gitpython
  2. Clone the GitHub repository:
from git import Repo
Repo.clone_from("https://github.com/ihamdi/Semantic-Segmentation.git", "/your/directory/")

       or download and extract a copy of the files.

  3. Install PyTorch according to your machine. For example:
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
  4. Install the dependencies from the requirements.txt file:
pip install -r requirements.txt
  5. Download the data:

       Run python scripts/download_data.py to download the data using the Kaggle API and extract it automatically. If you haven't used the Kaggle API before, see the instructions at the bottom of this page on how to get your API key.

       Otherwise, download the files from the official Carvana Image Masking Challenge page and extract "train_hq.zip" into the imgs folder and "train_masks.zip" into the masks folder, both inside the data directory.
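
For the manual route, the extraction step could be scripted roughly as follows. This is only a sketch and not the repository's scripts/download_data.py: the helper name extract_flat is hypothetical, and it assumes an archive may wrap its files in a top-level folder, so directory components inside the archive are dropped.

```python
import shutil
import zipfile
from pathlib import Path


def extract_flat(zip_path, dest_dir):
    """Extract every file in zip_path directly into dest_dir.

    Assumption: any folder structure inside the archive is not needed,
    so only the file names are kept.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue  # skip explicit directory entries
            target = dest / Path(member.filename).name
            # Stream the file so large images are not held in memory.
            with archive.open(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(target)
    return extracted
```

With the two archives downloaded into the working directory, extract_flat("train_hq.zip", "data/imgs") and extract_flat("train_masks.zip", "data/masks") would produce the layout described in the next section.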

Folder Structure

  1. data directory contains the imgs and masks folders.

       i. imgs subfolder is where the images are expected to be.

       ii. masks subfolder is where the masks are expected to be.

  2. scripts directory contains download_data.py, used to download the dataset directly from Kaggle.
  3. unet directory contains the UNet model.
  4. utils directory contains the data-loading and dice-score files.
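
Before training, it can help to confirm that every image in imgs has a mask in masks. The sketch below is not part of the repository; the helper find_unpaired is hypothetical, and it assumes a mask carries its image's name plus a "_mask" suffix (as in 1e6f48393e17_03 and 1e6f48393e17_03_mask), ignoring file extensions.

```python
from pathlib import Path


def find_unpaired(imgs_dir, masks_dir, mask_suffix="_mask"):
    """Return the names (without extension) of images that have no mask.

    Assumption: mask name = image name + mask_suffix; extensions are ignored.
    """
    mask_stems = {p.stem for p in Path(masks_dir).iterdir() if p.is_file()}
    image_stems = sorted(p.stem for p in Path(imgs_dir).iterdir() if p.is_file())
    return [stem for stem in image_stems if stem + mask_suffix not in mask_stems]
```

Calling find_unpaired("data/imgs", "data/masks") returns an empty list when the two folders line up.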

Dataset

Data is obtained from Kaggle's Carvana Image Masking Challenge competition. Image and mask archives are provided in both normal and high quality. This code uses train_hq.zip and train_masks.zip.

There are 318 cars in the train_hq.zip archive. Each car has exactly 16 images, each taken from a different angle. In addition, each car has a unique id, and its images are named id_01.jpg, id_02.jpg ... id_16.jpg.
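
The naming scheme means 318 × 16 = 5088 training images in total, and a file name can be split back into its car id and angle index. The helpers below are a hypothetical illustration of that scheme, not code from this repository.

```python
from collections import defaultdict


def split_name(filename):
    """Split 'id_NN.jpg' into (car_id, angle_index)."""
    stem = filename.rsplit(".", 1)[0]      # drop the extension
    car_id, angle = stem.rsplit("_", 1)    # the last underscore separates id and angle
    return car_id, int(angle)


def group_by_car(filenames):
    """Map each car id to the sorted list of angle indices found for it."""
    groups = defaultdict(list)
    for name in filenames:
        car_id, angle = split_name(name)
        groups[car_id].append(angle)
    return {car_id: sorted(angles) for car_id, angles in groups.items()}
```

A complete download should give 318 keys from group_by_car, each mapped to the angles 1 through 16.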

How to use

Run the following command:

python train.py

By default, the program trains for 5 epochs with batch size = 1, learning rate = 0.00001, num workers = 0, scale = 0.5, mixed precision enabled, and 10% of the dataset used for validation. You can pass the following arguments to change the default values:

  1. Epochs: --epochs
  2. Batch Size: --batch-size
  3. Learning Rate: --learning-rate
  4. Subset Size: --sample-size
  5. Number of Workers: --num-workers
  6. Image Scale: --scale
  7. Percentage used as Validation: --validation
  8. Mixed Precision: --amp

For example:

python train.py --sample-size 500 --num-workers 15 --amp
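
To make the flags and defaults concrete, here is a minimal argparse sketch that mirrors the list above. It is not copied from train.py, and two details are assumptions: --sample-size defaults to None (meaning the full dataset), and --amp is modeled with argparse.BooleanOptionalAction (Python 3.9+) so that the stated "enabled by default" can still be switched off with --no-amp.

```python
import argparse


def build_parser():
    """Build a parser with the flags and defaults described in this README."""
    parser = argparse.ArgumentParser(description="Train the UNet on images and masks")
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--batch-size", type=int, default=1)
    parser.add_argument("--learning-rate", type=float, default=1e-5)
    # Assumption: None means "use the whole dataset".
    parser.add_argument("--sample-size", type=int, default=None)
    parser.add_argument("--num-workers", type=int, default=0)
    parser.add_argument("--scale", type=float, default=0.5)
    # Percentage (0-100) of the data held out for validation.
    parser.add_argument("--validation", type=float, default=10.0)
    # Assumption: on by default, with --no-amp as the off switch.
    parser.add_argument("--amp", action=argparse.BooleanOptionalAction, default=True)
    return parser
```

Parsing the example command line above with this parser yields sample_size=500 and num_workers=15, with every other option left at its default.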

Results

The Dice score is printed at the end of every validation round. The program also uses Weights and Biases to log the training loss and accuracy as well as the Dice score, which makes it easy to visualize the results and check the status of runs without being at the training machine.
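
For readers new to the metric, the Dice score of a predicted mask A against a true mask B is 2|A ∩ B| / (|A| + |B|). The function below is a plain-Python sketch of that formula on flattened binary masks; it is not the implementation in the utils directory, and the small eps term is an assumption added so that two empty masks score 1 instead of dividing by zero.

```python
def dice_score(pred, target, eps=1e-6):
    """Dice coefficient for two flattened binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are 1 in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    # Total number of foreground pixels across both masks.
    total = sum(pred) + sum(target)
    return (2 * intersection + eps) / (total + eps)
```

A score of 1 means the masks overlap perfectly, and a score near 0 means they barely overlap.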

Changes made to Original Code

  1. Fixed data download problem from Kaggle. The code no longer gives an "unauthorized" error.
  2. Introduced sample_size to enable reduction of dataset if needed (mainly used for testing).
  3. Added num_workers as a variable so it doesn't need to be changed manually inside train.py.
  4. Added sample_size and num_workers to logging.
  5. Added sample_size and num_workers to arguments so they can be set easily when calling python train.py.
  6. Changed RMSProp to Adam optimizer for better results.
  7. Fixed the training progress bar. The original code showed a fixed number of total iterations even if the batch size was changed. The update was also choppy, happening only every 2 iterations.
  8. Fixed validation loop. Now it runs at the end of each epoch.
  9. Removed commented out lines and unused import statements.
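
As an illustration of how a sample_size option can interact with the validation percentage, the sketch below first reduces the dataset to a random subset and then holds out a share of it for validation. It works on plain indices and is an assumption about the general approach, not the code in train.py; the function name and the seed parameter are hypothetical.

```python
import random


def sample_and_split(n_items, sample_size=None, val_percent=10.0, seed=0):
    """Return (train_indices, val_indices) for a dataset of n_items.

    If sample_size is given and smaller than n_items, only that many
    randomly chosen items are used before the split.
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    if sample_size is not None and sample_size < n_items:
        indices = rng.sample(indices, sample_size)  # random subset, already shuffled
    else:
        rng.shuffle(indices)
    n_val = int(len(indices) * val_percent / 100)
    return indices[n_val:], indices[:n_val]
```

With the full set of 5088 images and the default 10%, this gives 4580 training and 508 validation items; with a sample size of 500 it gives 450 and 50.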

Background:

This was created to learn about semantic segmentation and UNet, so only the training data is used.


Contact:

For any questions or feedback, please feel free to post comments or contact me at ibraheem.hamdi@mbzuai.ac.ae


References:

Pytorch-UNet was used as the base for this code.

U-Net: Convolutional Networks for Biomedical Image Segmentation by Olaf Ronneberger, Philipp Fischer, Thomas Brox


*Getting a Key for Kaggle's API

[Three screenshots showing how to get the API key]
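
Once a key has been created, a quick check can confirm that it is somewhere the Kaggle API conventionally looks: the KAGGLE_USERNAME and KAGGLE_KEY environment variables, or a kaggle.json file in a .kaggle folder under the home directory. The helper below is a hypothetical sketch that only tests for their presence; it is not part of this repository, does not contact Kaggle, and does not cover other locations that some client versions may also accept.

```python
import os
from pathlib import Path


def kaggle_credentials_source(home=None, env=None):
    """Return where Kaggle credentials were found, or None if missing.

    home and env are overridable for testing; by default the real home
    directory and process environment are used.
    """
    env = os.environ if env is None else env
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return "environment"
    base = Path(home) if home is not None else Path.home()
    token = base / ".kaggle" / "kaggle.json"
    if token.is_file():
        return str(token)
    return None
```

If the function returns None, the download script is likely to fail with an authorization error until the key is put in place.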