More stuff from us
This is a DWT-inspired solution to the Kaggle's 2018 DS Bowl I produced within approximately 1 week before the end of the compeititon.
UPDATE 2018-04-22 - my score was 114th. I guess they are cleaning the LB in the end.
UPDATE 2018-04-24 - found out why my model generalized poorly - I forgot to re-create optimizer
after unfreezing the encoder weights.
Most prominently it features a dockerized PyTorch implementation of approach similar to Deep Watershed Transform.
Since the target metric was highly unstable (average mAP on 0.5 - 0.95 thresholds) and the private LB contained data mostly not related to the train dataset, it's a bit difficult to evaluate code performance, but it's safe to say that:
- The performance is for single model on one fold;
- ~400th place in stage 1 LB and ~100th place in stage 2 LB (most likely the position will rise, since the LB is not finalized yet);
- I did not invest time in ensembling / folding / annotation etc because I entered late and it was obvious that second stage would be a gamble given the quality of the dataset and organization;
Key take-aways:
- People have reported that VGG based models overfitted heavily on train/val and heavy Resnet models (e.g. Resnet152) were key. I did not test heavy Resnets, so it's very likely that performance can be impoved greatly;
- Top solutions also featured learning 2 masks - gt mask + a boundary BETWEEN nuclei;
- DWT / energy greatly helps - it gave ~0.07-0.09 mAP locally and ~0.05 - 0.07 on the LB;
- All other promising post-processing techniques (e.g. detecting centers of the nuclei with blob_log and using them as additional energy level) did not work;
- The base model is very powerful, but probably prone to overfitting due to dataset being of very low quality;
- Dataset curation / balancing / proper annotation - matters more than a particular architecture in this case;
Training
- 6+ core modern CPU (Xeon, i7) for fast image pre-processing (in this case distance transform takes some time for each nuclei);
- The models were trained on 2 * GeForce 1080 Ti;
- Training time on my setup ~ 6-8 hours per one fold;
- Disk space - 10GB should be more than enough, ~20GB for building a docker image;
Inference
- 6+ core modern CPU (Xeon, i7) for fast image pre-processing;
- On 2 * GeForce 1080 Ti inference takes 1-2 minutes for the public test dataset (65 images);
Clone the repository
git clone https://github.com/snakers4/ds_bowl_2018 .
This repository contains a Dockerfile used when training models
/dockerfiles/Dockerfile
- this is the main Dockerfile
Build a Docker image
cd dockerfiles
docker build -t bowl_image .
Install the latest nvidia docker
Follow instructions from here. Please prefer nvidia-docker2 for more stable performance.
To test all works fine run:
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
(IMPORTANT) Run docker container (IMPORTANT)
Unless you use this exact command (with --shm-size flag) (you can change ports and mounted volumes, of course), then the PyTorch generators WILL NOT WORK.
- nvidia-docker 2:
docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all -it -v /path/to/cloned/repository:/home/keras/notebook -p 8888:8888 -p 6006:6006 --shm-size 16G bowl_image
- nvidia-docker:
nvidia-docker -it -v /path/to/cloned/repository:/home/keras/notebook -p 8888:8888 -p 6006:6006 --shm-size 8G aveysov
To start the stopped container
docker start -i YOUR_CONTAINER_ID
- Ssh into the docker container via
docker exec -it YOUR_CONTAINER_ID
- Cd to the root folder of the repo
- Dowload the data into
data/
- Note that
data\
already contains pickled train dataframes with meta-data (for convenience only) - If kaggle removes the data download links from the competition page, you can download the data from here
After all of your manipulations your directory should look like this (omitting csv files):
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── stage1_test <- A folder with stage1 test data
│ ├── stage2_test <- A folder with stage2 test data
│ ├── test_df_stage1_meta <- A pickled dataframe with stage1 test meta data
│ ├── train_df_stage1_meta <- A pickled dataframe with stage1 train meta data
│ └── stage1_train <- A folder with stage1 train data
│ ├─ f8e74d4006dd68c1dbe68df7be905835e00d8ba4916f3b18884509a15fdc0b55
│ │ ├── images
│ │ └── masks
...
│ └─ ff599c7301daa1f783924ac8cbe3ce7b42878f15a39c2d19659189951f540f48
│
├── dockerfiles <- A folder with Dockerfiles
│
└── src <- Source code
You see the list of the available model presets in src/models/model_params.py
The best model according to my tests was Unet16 (Unet + Vgg16 pre-trained encoder)
If all is ok, then use the following command to train the model
- Ssh into the docker container via
docker exec -it YOUR_CONTAINER_ID
- Cd to the root folder of thre repo
cd src
- optional - turn on tensorboard for monitoring progress
tensorboard --logdir='ds_bowl_2018/src/tb_logs --port=6006
via jupyter notebook console or via tmux + docker exec (model converges in 100-150 epochs) - then for example train on 2 folds
echo 'python3 train_energy.py \
--arch unet16_160_7_dc --epochs 150 --workers 10 \
--channels 7 --batch-size 12 --fold_num 0 \
--lr 1e-3 --optimizer adam \
--bce_weight 0.9 --dice_weight 0.1 --ths 0.5 \
--print-freq 1 --lognumber unet16_160_7_dc_ths5_energy_distance_gray_final \
--tensorboard True --tensorboard_images True --is_distance_transform True --is_boundaries True \
--freeze True \
python3 train_energy.py \
--arch unet16_160_7_dc --epochs 150 --workers 10 \
--channels 7 --batch-size 12 --fold_num 1 \
--lr 1e-3 --optimizer adam \
--bce_weight 0.9 --dice_weight 0.1 --ths 0.5 \
--print-freq 1 --lognumber unet16_160_7_dc_ths5_energy_distance_gray_final \
--tensorboard True --tensorboard_images True --is_distance_transform True --is_boundaries True \
--freeze True \' > train.sh
sh train.sh
- Ssh into the docker container via
docker exec -it YOUR_CONTAINER_ID
- Cd to the root folder of the repo
cd src
- then
echo 'python3 train_energy.py \
--arch unet16_64_7_dc --channels 7 --batch-size 1 --ths 0.5 \
--lognumber unet16_64_7_dc_ths5_energy_distance_gray_longer_rerun \
--workers 0 --predict' > predict.sh
-
sh predict.sh
-
note that the
lognumber
is the lognumber you specified when training -
please check which fold is used in the prediction loop
-
You can also run evaluation-only scripts like this
python3 train_energy.py \
--evaluate \
--resume weights/unet16_160_7_dc_ths5_energy_distance_gray_final_fold2_best.pth.tar \
--arch unet16_160_7_dc --epochs 50 --workers 10 \
--channels 7 --fold_num 2 \
--ths 0.5 --is_distance_transform True --is_boundaries True \
--print-freq 10 --lognumber eval_validation --tensorboard_images True \
- The model is analogous to DWT since it uses predicted energy for watershed;
- The best performing wateshed post-processing scripts is in
utils.watershed.energy_baseline
; - All the other functions in
utils.watershed
performed worse;
- The model randomly crops images when training and resizes them when predicting;
- An unfinished
src/train_energy_pad.py
is also available. It works, but produces inferior quality;
Use these notebooks on your own risk!
src/bowl.ipynb
- general debugging notebook with new models / generators / etc