Melanoma Experiments + Weights and Biases Integration

Welcome to the repository! If you don't know what this repository is about, don't worry: that's what this README is for. It contains all the working code for the SIIM-ISIC Melanoma Classification Kaggle competition, which should get you into the top 5% of the leaderboard!

I used W&B heavily to track all my experiments, run hyperparameter sweeps, store datasets as W&B Tables, save model weights as model artifacts after every epoch, and use the embedding projector to interpret what the model learned.

I have also written four reports that explain each step in detail:

You can also find an example of the dashboard that this GitHub repository creates here: Melanoma W&B Dashboard.

Downloading the Dataset

To download the dataset, we can use the Kaggle API. Please run the following line of code to get the Melanoma dataset:

kaggle competitions download -c siim-isic-melanoma-classification
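The Kaggle CLI saves the competition data as a zip archive, which needs unpacking before the later steps can read it. The following is a minimal sketch using only the Python standard library; the archive name `siim-isic-melanoma-classification.zip` and the `data` target folder are assumptions, so adjust them to wherever your download landed.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Unpack a downloaded competition archive into dest_dir and return the extracted names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


if __name__ == "__main__":
    # Assumed archive name and target folder; adjust to your setup.
    names = extract_archive("siim-isic-melanoma-classification.zip", "data")
    print(f"Extracted {len(names)} files")
```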

Create folds

Once you've downloaded the dataset, create the training and validation folds by running:

python folds.py
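To give a feel for what fold creation involves, here is a rough sketch of stratified fold assignment, where each fold keeps roughly the same share of positive cases. It is not the code in `folds.py`; the function name, the fold count, and the toy labels are all illustrative assumptions. It uses only the standard library, although in practice scikit-learn's `StratifiedKFold` is the usual choice.

```python
import random
from collections import defaultdict


def assign_stratified_folds(labels, n_folds=5, seed=42):
    """Return a fold index for each label so every fold has a similar class balance."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled samples of this class round-robin across folds.
        for position, idx in enumerate(indices):
            folds[idx] = position % n_folds
    return folds


if __name__ == "__main__":
    # Toy labels: 1 = melanoma, 0 = benign (imbalanced on purpose).
    labels = [1] * 10 + [0] * 90
    folds = assign_stratified_folds(labels, n_folds=5)
    for k in range(5):
        positives = sum(1 for f, y in zip(folds, labels) if f == k and y == 1)
        print(f"fold {k}: {folds.count(k)} samples, {positives} positives")
```

Stratifying matters here because positive cases are rare, and a plain random split could leave a validation fold with very few of them.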

Data Preprocessing

For a detailed explanation of data preprocessing, please refer to the report How to prepare the dataset for the Melanoma Classification?.

The line of code that you want to run once you've downloaded the dataset is below:

python resize_images.py --input_folder <path to input data> --output_folder <path to output folder> --cc --mantain_aspect_ratio --sz 256
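As an illustration of what resizing with a preserved aspect ratio can mean, the sketch below computes a target size whose shorter side equals `sz`. This is one common interpretation and an assumption on my part; `resize_images.py` may compute the size differently, and the `--cc` option is not covered here.

```python
def aspect_preserving_size(width, height, sz):
    """Scale (width, height) so the shorter side equals sz, keeping the aspect ratio."""
    if width <= 0 or height <= 0 or sz <= 0:
        raise ValueError("width, height and sz must be positive")
    shorter = min(width, height)
    # Multiply before dividing so exact ratios stay exact.
    new_width = max(sz, round(width * sz / shorter))
    new_height = max(sz, round(height * sz / shorter))
    return new_width, new_height


if __name__ == "__main__":
    # A 6000x4000 photo resized with sz=256 keeps its 3:2 shape.
    print(aspect_preserving_size(6000, 4000, 256))
```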

Model Training

Once you have downloaded and pre-processed the dataset, we are now ready to perform model training. The training script automatically logs all experiments to Weights and Biases.

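Please run the following line of code to kick off model training, replacing the --training_folds_csv and --train_data_dir paths with the locations on your own machine. (Note that you will need a machine with a GPU to run model training.)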

python train.py --model_name efficient_net --arch_name efficientnet-b0 --device cuda --metric 'auc' --training_folds_csv /home/arora/git_repos/melanoma_wandb/data/train_folds.csv --train_data_dir /home/arora/git_repos/melanoma_wandb/data/usr/resized_train_256_cc --kfold 0 --pretrained imagenet --train_batch_size 64 --valid_batch_size 64 --learning_rate 5e-4 --epochs 10 --sz 224 --loss 'weighted_focal_loss'
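The command above selects a loss named `weighted_focal_loss`. The repository's own implementation is not shown in this README, so the following is only a sketch of the standard alpha-weighted binary focal loss for a single prediction; the function signature and the defaults `alpha=0.25` and `gamma=2.0` are assumptions taken from the usual formulation, not from `train.py`.

```python
import math


def weighted_focal_loss(prob, target, alpha=0.25, gamma=2.0, eps=1e-7):
    """Alpha-weighted binary focal loss for one prediction.

    prob is the predicted probability of the positive class, target is 0 or 1.
    """
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A confident correct prediction contributes far less than a confident wrong one.
    print(weighted_focal_loss(0.9, 1))
    print(weighted_focal_loss(0.1, 1))
```

The idea is that easy examples are down-weighted so the rare positive cases contribute more to the gradient, which is why this family of losses is popular on imbalanced data.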

Kick off Hyperparameter Sweep

To kick off a hyperparameter sweep:

  1. Go to the W&B project page that's created when you first kick off training.
  2. Go to W&B Sweeps.
  3. Click "Create Sweep".
  4. Define the sweep parameters and click "Initialize Sweep".

That should create a nice-looking sweep dashboard with all hyperparameter values and validation metric scores, as in this example dashboard: W&B Melanoma Sweep.
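If you would rather define the sweep in code than in the UI, a sweep definition can also be written as a Python dictionary and registered with `wandb.sweep`. The sketch below is hypothetical: the parameter names mirror the `train.py` flags, but the metric name `valid_auc`, the value ranges, and the second architecture are assumptions you would replace with whatever your runs actually log and support.

```python
# Hypothetical sweep definition; parameter names mirror the train.py flags,
# while the metric name and value ranges are assumptions.
sweep_config = {
    "method": "bayes",
    "metric": {"name": "valid_auc", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 1e-3},
        "train_batch_size": {"values": [32, 64]},
        "arch_name": {"values": ["efficientnet-b0", "efficientnet-b1"]},
    },
}


def create_sweep(project):
    """Register the sweep with W&B and return its ID (needs wandb installed and a login)."""
    import wandb  # imported lazily so the config can be inspected without wandb installed

    return wandb.sweep(sweep_config, project=project)
```

Calling `create_sweep("<your project name>")` returns a sweep ID, which you can then hand to `wandb agent` to start running trials.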