Camelyon17 (Breast Tumor Classification)

Primary language: Python. License: MIT.

1. Camelyon17

1-1) About

This project provides CNN models for diagnosing breast cancer. It includes two architectures, DenseNet and ResNet. The performance criterion is AUC (Area Under the Curve). The model outputs a float between 0 and 1 (0: Normal, 1: Tumor).

1-2) Architecture

models/densenet.py
models/resnet.py

2. Dataset

2-1) Overview

CAMELYON17 is the second grand challenge in pathology organised by the Diagnostic Image Analysis Group (DIAG) and the Department of Pathology of the Radboud University Medical Center (Radboudumc) in Nijmegen, The Netherlands. The data in this challenge contains whole-slide images (WSI) of hematoxylin and eosin (H&E) stained lymph node sections. All ground truth annotations were carefully prepared under the supervision of expert pathologists. For the purpose of revising the slides, additional slides stained with cytokeratin immunohistochemistry were used. If, however, you encounter problems with the dataset, please report your findings on the forum.

utils.py
dataset_train.py
dataset_eval.py

2-2) Data Augmentation

Flip images horizontally
Flip images vertically
Convert images to grayscale at random (probability = 10%)
Slightly jitter image brightness, contrast, saturation, and hue
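
The augmentations listed above can be illustrated with a dependency-free sketch. It is not the repository's implementation: images are modelled as nested lists of RGB tuples, the flip probability `p_flip` and jitter range `max_jitter` are assumed values, and only brightness jitter is shown (contrast, saturation, and hue are omitted for brevity). In practice a library such as torchvision offers equivalent transforms (`RandomHorizontalFlip`, `RandomVerticalFlip`, `RandomGrayscale`, `ColorJitter`).

```python
import random


def hflip(img):
    """Mirror left-right: reverse the pixel order within each row."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror top-bottom: reverse the order of the rows."""
    return img[::-1]


def to_gray(img):
    """Replace each RGB pixel by its luminance, kept as three equal channels."""
    out = []
    for row in img:
        new_row = []
        for r, g, b in row:
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            new_row.append((y, y, y))
        out.append(new_row)
    return out


def adjust_brightness(img, factor):
    """Scale every channel by `factor` and clamp the result to 0..255."""
    return [
        [tuple(min(255, max(0, round(c * factor))) for c in px) for px in row]
        for row in img
    ]


def augment(img, rng, p_flip=0.5, p_gray=0.1, max_jitter=0.1):
    """Apply the augmentations in sequence.

    p_gray=0.1 matches the 10% stated in this section; p_flip and
    max_jitter are assumptions made for this sketch.
    """
    if rng.random() < p_flip:
        img = hflip(img)
    if rng.random() < p_flip:
        img = vflip(img)
    if rng.random() < p_gray:
        img = to_gray(img)
    factor = 1.0 + rng.uniform(-max_jitter, max_jitter)
    return adjust_brightness(img, factor)
```

Passing an explicit `random.Random(seed)` as `rng` keeps the augmentation reproducible between runs.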

2-3) Mask

Patches are extracted using the following masks. Each patch is selected according to its mask inclusion ratio, which is a hyperparameter (see 3-3).

  • Tissue Mask
  • Tumor Mask
  • Normal Mask
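
As a rough illustration of selection by mask inclusion ratio, the sketch below computes the fraction of mask pixels inside a square window and keeps the windows whose ratio reaches a threshold. The function names, the non-overlapping grid (`stride` defaults to the patch size), and the "greater than or equal" comparison are assumptions of this sketch, not details taken from utils.py.

```python
def inclusion_ratio(mask, top, left, size):
    """Fraction of pixels set to 1 inside a size x size window of a binary mask."""
    window = [row[left:left + size] for row in mask[top:top + size]]
    total = sum(len(row) for row in window)
    return sum(map(sum, window)) / total if total else 0.0


def select_patches(mask, size, threshold, stride=None):
    """Return (top, left) corners of windows whose inclusion ratio >= threshold."""
    stride = stride or size  # assumption: non-overlapping grid by default
    height, width = len(mask), len(mask[0])
    coords = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            if inclusion_ratio(mask, top, left, size) >= threshold:
                coords.append((top, left))
    return coords


# Toy 4x4 tumor mask; with 2x2 patches only the top-left patch is fully covered.
toy_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
tumor_patches = select_patches(toy_mask, size=2, threshold=0.8)
```

The same function could be applied to the tissue, tumor, and normal masks in turn, each with its own threshold.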

2-4) Hard Mining

Training samples that the model has predicted incorrectly several times are collected in a CSV file. The network is then trained on the combination of these difficult samples and the original training set.
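
A minimal sketch of this idea follows. The record layout `(patch_id, label, predicted_label)`, the cut-off `min_errors`, and the single-column CSV format are assumptions made for illustration. The actual file format used by the repository is not described here.

```python
import csv
import io
import random
from collections import Counter


def collect_hard_examples(history, min_errors=2):
    """Return ids of patches misclassified at least `min_errors` times.

    `history` is an iterable of (patch_id, label, predicted_label) records
    gathered over several epochs (hypothetical layout).
    """
    errors = Counter(pid for pid, label, pred in history if label != pred)
    return sorted(pid for pid, n in errors.items() if n >= min_errors)


def write_hard_csv(patch_ids, fileobj):
    """Write the hard patch ids to a CSV file object, one id per row."""
    writer = csv.writer(fileobj)
    writer.writerow(["patch_id"])
    writer.writerows([pid] for pid in patch_ids)


def combined_training_list(original_ids, hard_ids, rng):
    """Concatenate the original and hard samples, then shuffle them."""
    combined = list(original_ids) + list(hard_ids)
    rng.shuffle(combined)
    return combined
```

Because hard samples also appear in the original set, they are seen more than once per epoch in the combined list, which is what gives them extra weight during training.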

3. Train

train.py

3-1) Optimizer

Stochastic Gradient Descent Optimizer

3-2) Loss Function

Binary Cross Entropy Loss (torch.nn.BCELoss)
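
For reference, binary cross entropy averaged over a batch can be written out by hand as below. This is an illustrative re-implementation, not the code `torch.nn.BCELoss` runs: the `eps` clipping used here to avoid `log(0)` is an assumption of the sketch, whereas PyTorch clamps the log terms instead.

```python
import math


def bce_loss(probs, targets, eps=1e-12):
    """Mean binary cross entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite at p = 0 or p = 1
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)
```

A confident correct prediction (for example 0.9 for a tumor patch) contributes a small loss, while a confident wrong prediction is penalised heavily.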

3-3) Hyperparameter

patch size = 304

normal threshold = 0.1  # normal mask inclusion ratio used to select normal patches
tumor threshold = 0.8   # tumor mask inclusion ratio used to select tumor patches
tissue threshold = 0.4  # tissue mask inclusion ratio used to select tissue patches

default learning rate = 0.005  # default learning rate
momentum = 0.9      # SGD optimizer parameter, 'momentum'
weight decay = 5e-4 # SGD optimizer parameter, 'weight_decay'
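
To show how the learning rate, momentum, and weight decay interact, here is a hand-written SGD step on plain Python floats. It follows the update rule documented for `torch.optim.SGD` with zero dampening and no Nesterov momentum. The function name and the list-based interface are hypothetical and exist only for this sketch.

```python
def sgd_step(params, grads, velocity, lr=0.005, momentum=0.9, weight_decay=5e-4):
    """Perform one SGD update and return (new_params, new_velocity)."""
    new_params, new_velocity = [], []
    for p, g, v in zip(params, grads, velocity):
        g = g + weight_decay * p   # weight decay adds an L2 term to the gradient
        v = momentum * v + g       # momentum accumulates past gradients
        new_params.append(p - lr * v)
        new_velocity.append(v)
    return new_params, new_velocity


# One step from w = 1.0 with gradient 0.5 and zero initial velocity.
params, velocity = sgd_step([1.0], [0.5], [0.0])
```

With a constant gradient the velocity grows over successive steps, so the effective step size becomes larger than `lr` alone would suggest.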

4. Validation, Test

4-1) Statistical index

4-2) Checkpoint

  • Info: Net, Accuracy, Loss, Recall, Specificity, Precision, F1_score, AUC, epoch, learning rate, threshold
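
The statistics stored in a checkpoint can be computed from model scores and ground-truth labels roughly as follows. This is a sketch under assumptions: the function names and the default threshold of 0.5 are illustrative, and AUC is computed by the pairwise-ranking definition (the probability that a tumor sample scores higher than a normal one, with ties counted as half), which is quadratic in the number of samples and suited only to small examples.

```python
def confusion_counts(scores, labels, threshold=0.5):
    """Count (tp, fp, tn, fn) after thresholding scores into 0/1 predictions."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _ratio(num, den):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(scores, labels, threshold=0.5):
    """Return accuracy, recall, specificity, precision, and F1 as a dict."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    recall = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "recall": recall,
        "specificity": _ratio(tn, tn + fp),
        "precision": precision,
        "f1_score": _ratio(2 * precision * recall, precision + recall),
    }


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))
```

Unlike the other metrics, AUC does not depend on the threshold, which makes it a convenient single criterion for comparing models.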

5. Result

  • First trial

  • Second trial

6. HeatMap

eval.py
  • HeatMap Example

7. Requirement

8. Usage

  1. Download the image zip files from CAMELYON17.
  2. Run utils.py to create the dataset.
  3. Run train.py to train the model on the extracted patches.
  4. Run eval.py.

9. Reference