Convolutional neural network model for Histopathologic Cancer Detection, based on a modified version of the PatchCamelyon dataset, that achieves >0.98 AUROC on the Kaggle private test set.
To reproduce my solution without retraining, follow the steps below.
All requirements are listed in requirements.txt. Using a virtual environment is strongly recommended.
$ python3 -m venv venv_hist
$ source venv_hist/bin/activate
$ pip install -r requirements.txt
Download the data and extract the archives to the data directory.
$ kaggle competitions download -c histopathologic-cancer-detection
$ wget
$ unzip -d data
$ unzip -d data
$ rm *.zip
$ python tools/
In the configs directory, you can find the configurations I used to train my final models.
To train a model, run the following command.
$ python --config_name {config_path}
You can download the pretrained models that were used for my final model from link
$ mkdir -p weights
$ bash
Once the trained weights are in place, you can create files that contain predictions for the test set using the testing config files from the configs directory.
To run inference with a single model:
$ python --config_name configs/test/{}
To run inference with a blend (simple mean) of several models:
$ python --config_name configs/test/test_blend.yml
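For reference, a simple-mean blend just averages the predicted probability of each test id across models. The sketch below is only an illustration of that idea, not the code behind test_blend.yml: the helper names `read_predictions` and `blend_mean` are hypothetical, and the `id,label` column layout is assumed from the Kaggle submission format.

```python
import csv
import io


def read_predictions(handle):
    """Read an `id,label` CSV (assumed layout) into a dict of id -> probability."""
    return {row["id"]: float(row["label"]) for row in csv.DictReader(handle)}


def blend_mean(prediction_sets):
    """Average per-id probabilities over several models (simple mean)."""
    ids = list(prediction_sets[0])
    for preds in prediction_sets[1:]:
        if set(preds) != set(ids):
            raise ValueError("all prediction files must cover the same ids")
    n_models = len(prediction_sets)
    return {i: sum(p[i] for p in prediction_sets) / n_models for i in ids}


# In-memory stand-ins for two single-model prediction files (made-up values).
sub_a = io.StringIO("id,label\na1,0.25\nb2,0.75\n")
sub_b = io.StringIO("id,label\na1,0.75\nb2,0.25\n")

blended = blend_mean([read_predictions(sub_a), read_predictions(sub_b)])
print(blended)
```

With real files, you would pass open file handles instead of the `io.StringIO` stand-ins.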
After inference, you can find the prediction .csv files in the subs directory. Keep in mind that test predictions are generated with test-time augmentation (TTA-4) by default, which makes inference several times slower.
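To see why TTA multiplies inference cost, here is a minimal pure-Python sketch of four-view TTA. It assumes the four views are the identity, a horizontal flip, a vertical flip, and both flips combined; the actual transforms behind TTA-4 in this repo may differ. `toy_model` is a hypothetical stand-in for a trained network, and the model is called once per view, so inference takes roughly four forward passes per image.

```python
from statistics import fmean


def hflip(img):
    """Mirror an image (a list of rows) left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror an image (a list of rows) top to bottom."""
    return img[::-1]


def tta4_views(img):
    """Four views: identity, horizontal flip, vertical flip, both flips (assumed set)."""
    return [img, hflip(img), vflip(img), vflip(hflip(img))]


def predict_tta(model, img):
    """Average the model's score over all augmented views of one image."""
    return fmean(model(view) for view in tta4_views(img))


def toy_model(img):
    """Hypothetical stand-in model: scores an image by its top-left pixel."""
    return float(img[0][0])


patch = [[0.0, 1.0],
         [1.0, 1.0]]
tta_score = predict_tta(toy_model, patch)
print(tta_score)  # mean of the four corner pixels: 0.75
```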
- Training logs (training and validation metrics) are stored in the /.runs directory for tensorboard.
- Basic logs (training and validation metrics, hyperparameters, scheduling info, etc.) are also stored in the /logs directory.
- TTA is done using the great ttach repo.
- Best weights (based on validation AUROC) are stored in the /weights directory. You can find SWA code in tools/. If you decide to use it, just uncomment the code that saves model weights at the end of each epoch.
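As background on why SWA needs the per-epoch weights: stochastic weight averaging takes an equal average of the weights saved at several epochs. The sketch below shows that running average on plain Python lists. It is an illustration under simplifying assumptions, not the code in tools/; `update_swa` and the snapshot values are hypothetical. With a real network, batch-norm statistics are usually recomputed after averaging.

```python
def update_swa(swa_weights, new_weights, n_averaged):
    """Fold one more weight snapshot into a running equal average.

    Weights are represented as dicts of parameter name -> list of floats.
    Returns the updated average and the number of snapshots it covers.
    """
    if swa_weights is None:
        # First snapshot: the average is just a copy of it.
        return {name: list(values) for name, values in new_weights.items()}, 1
    updated = {
        name: [
            avg + (new - avg) / (n_averaged + 1)
            for avg, new in zip(swa_weights[name], new_weights[name])
        ]
        for name in swa_weights
    }
    return updated, n_averaged + 1


# Hypothetical per-epoch snapshots of a single two-element parameter.
snapshots = [
    {"w": [1.0, 2.0]},
    {"w": [3.0, 4.0]},
    {"w": [5.0, 6.0]},
]

swa, count = None, 0
for snapshot in snapshots:
    swa, count = update_swa(swa, snapshot, count)

print(swa, count)
```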