/dca_artifact_removal

Source code for the paper: "Dermoscopic Dark Corner Artifacts Removal: Friend or Foe?"

Dermoscopic Dark Corner Artifacts Removal: Friend or Foe?

Citation

If you use any methods, data, or code from this repository, please consider citing our paper:

@article{pewton2023dca,
  title = {Dermoscopic dark corner artifacts removal: Friend or foe?},
  journal = {Computer Methods and Programs in Biomedicine},
  volume = {244},
  pages = {107986},
  year = {2024},
  issn = {0169-2607},
  doi = {https://doi.org/10.1016/j.cmpb.2023.107986},
  author = {Samuel William Pewton and Bill Cassidy and Connah Kendrick and Moi Hoon Yap}
}

Masks

If you only need the dark corner artifact (DCA) masks from these experiments for use with your own dataset, they can be downloaded from the following Kaggle dataset:

https://www.kaggle.com/datasets/mmucomputervision/dark-corner-artifact-masks-for-isic-images

Requirements

  1. Datasets:
    • ISIC unbalanced dataset (duplicates removed): follow the guide at https://github.com/mmu-dermatology-research/isic_duplicate_removal_strategy and save this dataset within the Data directory.
    • Fitzpatrick 17k: follow the guide at https://github.com/mattgroh/fitzpatrick17k and save this dataset within the Data directory.
    • DCA Masks: use the "Generate all DCA masks" method at https://github.com/mmu-dermatology-research/dark_corner_artifact_removal and save the results within the Data directory at ./Data/DCA_Masks/
  2. Models:
  3. Installations:
    • Python 3.9.7
      • Anaconda 4.11.0
      • pandas 1.3.5
      • numpy 1.21.5
      • scikit-learn 1.0.2
      • scikit-image 0.16.2
      • Jupyter Notebook
      • matplotlib 3.5.0
      • OpenCV 4.5.5
      • Pillow 8.4.0
      • Tensorflow 2.9.0-dev20220203
      • Tensorflow-GPU 2.9.0-dev20220203
      • CUDA 11.2.1
      • CuDNN 8.1
      • Keras
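The pinned versions above can be checked against the active environment before running anything. The following is a minimal sketch using only the standard library; the distribution names (for example `opencv-python` for OpenCV) and the helper name `check_versions` are assumptions and are not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above. The distribution names are assumptions
# (for example, OpenCV is assumed to be installed as "opencv-python").
PINNED = {
    "pandas": "1.3.5",
    "numpy": "1.21.5",
    "scikit-learn": "1.0.2",
    "scikit-image": "0.16.2",
    "matplotlib": "3.5.0",
    "opencv-python": "4.5.5",
    "Pillow": "8.4.0",
}


def check_versions(pinned):
    """Return {name: (expected, installed or None, status)} for each package."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (expected, None, "missing")
            continue
        # startswith tolerates build suffixes such as "4.5.5.64".
        status = "ok" if installed.startswith(expected) else "mismatch"
        report[name] = (expected, installed, status)
    return report


if __name__ == "__main__":
    for name, (expected, installed, status) in check_versions(PINNED).items():
        print(f"{name:15s} expected {expected:8s} installed {installed} [{status}]")
```

A "mismatch" does not necessarily mean the code will fail; it only flags a difference from the environment listed above.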

Generating the DCA split dataset

  1. Open the "./Modules/create_balanced_dca_dataset.py" module.
  2. Read the module docstring carefully, changing file paths as necessary.
  3. Execute the module.

Project Steps

  1. Train the models: train three InceptionResNetV2 networks, one on each of the training/validation sets, to produce a model for the clean set, a model for the binary DCA set, and a model for the realistic DCA set. Refer to the paper for more information on the network hyper-parameters.
  2. Score the models: score each of the models on each of the individual test sets. This can be done with the model_performance.py module.
  3. Extract the Grad-CAM heatmaps from all images: run the extract_gradcam.ipynb notebook, ensuring that all of the required file paths are uncommented.
  4. Calculate the brightness intensities for each of the test set images: modify the base image file path in the split_intensity.py module to point to the root folder of the extracted heatmaps. Run the script to generate a .csv file containing the internal and external brightness measures for each image. Once this is complete, run the calculate_intensity_averages.py module to calculate the averages across all of the images.
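As a rough illustration of step 4, the sketch below computes an internal and an external mean brightness for one heatmap and then averages the measures over several images. It is not the code in split_intensity.py or calculate_intensity_averages.py. It assumes that "internal" means the pixels inside a binary region mask and "external" means all remaining pixels, and the function names are hypothetical.

```python
import numpy as np


def split_intensity(heatmap, region_mask):
    """Return (internal_mean, external_mean) brightness of a greyscale heatmap.

    heatmap:     2-D array of intensities (for example 0-255).
    region_mask: 2-D boolean array of the same shape; True marks the region.
    Assumption: "internal" = inside the mask, "external" = everything else.
    """
    heatmap = np.asarray(heatmap, dtype=np.float64)
    region_mask = np.asarray(region_mask, dtype=bool)
    if heatmap.shape != region_mask.shape:
        raise ValueError("heatmap and region_mask must have the same shape")
    # Guard against empty selections so the mean is never taken over nothing.
    internal = heatmap[region_mask].mean() if region_mask.any() else float("nan")
    external = heatmap[~region_mask].mean() if (~region_mask).any() else float("nan")
    return float(internal), float(external)


def average_intensities(pairs):
    """Average the internal and external measures across images, ignoring NaNs."""
    arr = np.asarray(pairs, dtype=np.float64)
    return tuple(np.nanmean(arr, axis=0))


# Toy 4x4 heatmap: bright 2x2 centre, dark border.
heatmap = np.zeros((4, 4))
heatmap[1:3, 1:3] = 200
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
print(split_intensity(heatmap, mask))  # (200.0, 0.0)
```

In practice the per-image pairs would be written to the .csv file first and averaged afterwards, which is why the two functions are kept separate here.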

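Full Model Performances on all individual testing sets:

The metric columns are grouped under the headers "Metrics" and "Micro-Average".

| Model Used | Test Set | Acc | TPR | TNR | F1 | AUC | Precision |
|---|---|---|---|---|---|---|---|
| Clean | base-small | 0.59 | 0.86 | 0.32 | 0.68 | 0.63 | 0.56 |
| Clean | ns-small | 0.59 | 0.86 | 0.31 | 0.68 | 0.62 | 0.56 |
| Clean | telea-small | 0.59 | 0.86 | 0.31 | 0.68 | 0.62 | 0.56 |
| Clean | base-medium | 0.57 | 0.91 | 0.24 | 0.68 | 0.64 | 0.54 |
| Clean | ns-medium | 0.62 | 0.88 | 0.36 | 0.70 | 0.68 | 0.58 |
| Clean | telea-medium | 0.62 | 0.87 | 0.36 | 0.69 | 0.68 | 0.58 |
| Clean | base-large | 0.51 | 0.99 | 0.01 | 0.67 | 0.58 | 0.50 |
| Clean | ns-large | 0.64 | 0.85 | 0.44 | 0.70 | 0.71 | 0.60 |
| Clean | telea-large | 0.65 | 0.85 | 0.45 | 0.71 | 0.71 | 0.61 |
| Clean | base-oth | 0.58 | 0.90 | 0.26 | 0.67 | 0.65 | 0.55 |
| Clean | ns-oth | 0.58 | 0.87 | 0.29 | 0.67 | 0.66 | 0.55 |
| Clean | telea-oth | 0.58 | 0.87 | 0.29 | 0.67 | 0.66 | 0.55 |
| Binary DCA | base-small | 0.61 | 0.90 | 0.33 | 0.70 | 0.67 | 0.57 |
| Binary DCA | ns-small | 0.61 | 0.89 | 0.33 | 0.70 | 0.67 | 0.57 |
| Binary DCA | telea-small | 0.61 | 0.89 | 0.33 | 0.70 | 0.67 | 0.57 |
| Binary DCA | base-medium | 0.63 | 0.94 | 0.31 | 0.72 | 0.68 | 0.58 |
| Binary DCA | ns-medium | 0.65 | 0.85 | 0.44 | 0.71 | 0.73 | 0.60 |
| Binary DCA | telea-medium | 0.65 | 0.85 | 0.45 | 0.70 | 0.73 | 0.61 |
| Binary DCA | base-large | 0.55 | 0.96 | 0.13 | 0.68 | 0.62 | 0.53 |
| Binary DCA | ns-large | 0.70 | 0.79 | 0.61 | 0.73 | 0.75 | 0.67 |
| Binary DCA | telea-large | 0.70 | 0.78 | 0.61 | 0.72 | 0.75 | 0.67 |
| Binary DCA | base-oth | 0.60 | 0.83 | 0.36 | 0.67 | 0.67 | 0.57 |
| Binary DCA | ns-oth | 0.60 | 0.82 | 0.39 | 0.67 | 0.68 | 0.57 |
| Binary DCA | telea-oth | 0.60 | 0.82 | 0.39 | 0.67 | 0.68 | 0.57 |
| Realistic DCA | base-small | 0.60 | 0.85 | 0.35 | 0.68 | 0.65 | 0.57 |
| Realistic DCA | ns-small | 0.60 | 0.85 | 0.35 | 0.68 | 0.66 | 0.57 |
| Realistic DCA | telea-small | 0.60 | 0.84 | 0.36 | 0.68 | 0.66 | 0.57 |
| Realistic DCA | base-medium | 0.64 | 0.75 | 0.53 | 0.68 | 0.70 | 0.62 |
| Realistic DCA | ns-medium | 0.66 | 0.84 | 0.48 | 0.71 | 0.72 | 0.62 |
| Realistic DCA | telea-medium | 0.66 | 0.82 | 0.49 | 0.71 | 0.73 | 0.62 |
| Realistic DCA | base-large | 0.60 | 0.39 | 0.80 | 0.49 | 0.63 | 0.66 |
| Realistic DCA | ns-large | 0.66 | 0.70 | 0.63 | 0.68 | 0.74 | 0.65 |
| Realistic DCA | telea-large | 0.67 | 0.69 | 0.65 | 0.67 | 0.74 | 0.66 |
| Realistic DCA | base-oth | 0.58 | 0.81 | 0.35 | 0.66 | 0.65 | 0.55 |
| Realistic DCA | ns-oth | 0.58 | 0.79 | 0.37 | 0.65 | 0.65 | 0.56 |
| Realistic DCA | telea-oth | 0.58 | 0.79 | 0.37 | 0.65 | 0.65 | 0.56 |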
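For readers unfamiliar with the column names, the threshold-based metrics in the table above (Acc, TPR, TNR, Precision, F1) follow from the four confusion-matrix counts of a binary classifier. The sketch below uses the standard definitions; it is not the code in model_performance.py, the function name is hypothetical, and the example counts are invented for illustration rather than taken from the paper. AUC is omitted because it needs the per-image scores, not just the counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute Acc, TPR, TNR, Precision and F1 from confusion-matrix counts."""

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    acc = ratio(tp + tn, tp + fp + tn + fn)   # overall accuracy
    tpr = ratio(tp, tp + fn)                  # true positive rate (sensitivity)
    tnr = ratio(tn, tn + fp)                  # true negative rate (specificity)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * tpr, precision + tpr)
    return {"Acc": acc, "TPR": tpr, "TNR": tnr, "Precision": precision, "F1": f1}


# Hypothetical counts for a 200-image test set (not from the paper).
metrics = binary_metrics(tp=90, fp=60, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A model that labels almost every image as positive shows the pattern seen in some rows of the table: TPR close to 1, TNR close to 0, and accuracy near the positive-class proportion.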

References

@article{groh2021evaluating,
  title   = {Evaluating Deep Neural Networks Trained on Clinical Images in Dermatology with the Fitzpatrick 17k Dataset},
  author  = {Groh, Matthew and Harris, Caleb and Soenksen, Luis and Lau, Felix and Han, Rachel and Kim, Aerin and Koochek, Arash and Badri, Omar},
  journal = {arXiv preprint arXiv:2104.09957},
  year    = {2021}
}

@article{cassidy2021isic,
 title   = {Analysis of the ISIC Image Datasets: Usage, Benchmarks and Recommendations},
 author  = {Bill Cassidy and Connah Kendrick and Andrzej Brodzicki and Joanna Jaworek-Korjakowska and Moi Hoon Yap},
 journal = {Medical Image Analysis},
 year    = {2021},
 issn    = {1361-8415},
 doi     = {https://doi.org/10.1016/j.media.2021.102305},
 url     = {https://www.sciencedirect.com/science/article/pii/S1361841521003509}
} 

@misc{rosebrock_2020, 
 title   = {Grad-cam: Visualize class activation maps with Keras, tensorflow, and Deep Learning}, 
 url     = {https://pyimagesearch.com/2020/03/09/grad-cam-visualize-class-activation-maps-with-keras-tensorflow-and-deep-learning/}, 
 journal = {PyImageSearch}, 
 author  = {Rosebrock, Adrian}, 
 year    = {2020}, 
 month   = {3},
 note    = {[Accessed: 10-03-2022]}
} 

@article{scikit-image,
 title   = {scikit-image: image processing in {P}ython},
 author  = {van der Walt, {S}t\'efan and {S}ch\"onberger, {J}ohannes {L}. and
           {Nunez-Iglesias}, {J}uan and {B}oulogne, {F}ran\c{c}ois and {W}arner,
           {J}oshua {D}. and {Y}ager, {N}eil and {G}ouillart, {E}mmanuelle and
           {Y}u, {T}ony and the scikit-image contributors},
 year    = {2014},
 month   = {6},
 keywords = {Image processing, Reproducible research, Education,
             Visualization, Open source, Python, Scientific programming},
 volume  = {2},
 pages   = {e453},
 journal = {PeerJ},
 issn    = {2167-8359},
 url     = {https://doi.org/10.7717/peerj.453},
 doi     = {10.7717/peerj.453}
}

@article{scikit-learn,
 title   = {Scikit-learn: Machine Learning in {P}ython},
 author  = {Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
         and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
         and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
         Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
 journal = {Journal of Machine Learning Research},
 volume  = {12},
 pages   = {2825--2830},
 year    = {2011}
}

@inproceedings{lim2017enhanced,
  title     = {Enhanced deep residual networks for single image super-resolution},
  author    = {Lim, Bee and Son, Sanghyun and Kim, Heewon and Nah, Seungjun and Mu Lee, Kyoung},
  booktitle = {Proceedings of the IEEE conference on computer vision and pattern recognition workshops},
  pages     = {136--144},
  year      = {2017}
}