The problem we address is the manual detection of cancer metastases. Metastases are secondary cancer growths that occur when cancer cells spread from the primary tumor to other parts of the body. Pathology images are very high-resolution, reaching gigapixel sizes, while the tumors can be relatively small. Detecting these metastases manually is therefore not only time-consuming but also prone to human error. Automated detection can help here, and we aim to improve the accuracy and efficiency of cancer diagnosis.
The data pipeline includes five steps:
- Initial data analysis
- Data preprocessing
- Modeling
- Data postprocessing
- Prediction and evaluation
- Training and evaluating our models was challenging because of the large number of patches and the tumor class imbalance.
- Another major challenge is extracting features that are clinically meaningful.
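
The class imbalance mentioned above is often handled by undersampling the majority class before training. The following is a minimal sketch of that idea, not the method used in the notebooks: the function name `balanced_patch_indices`, the label convention (1 = tumor, 0 = normal), and the toy label array are all assumptions made for illustration.

```python
import numpy as np


def balanced_patch_indices(labels, seed=0):
    """Return shuffled patch indices with equal tumor and normal counts.

    Hypothetical helper: undersamples the majority class down to the
    size of the minority class. Assumes labels are 1 (tumor) or 0 (normal).
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    tumor_idx = np.flatnonzero(labels == 1)
    normal_idx = np.flatnonzero(labels == 0)
    n = min(len(tumor_idx), len(normal_idx))
    chosen = np.concatenate([
        rng.choice(tumor_idx, size=n, replace=False),
        rng.choice(normal_idx, size=n, replace=False),
    ])
    rng.shuffle(chosen)  # mix the two classes so batches are not ordered by label
    return chosen


# Toy labels (made up for illustration): 50 tumor patches among 1000.
labels = np.array([1] * 50 + [0] * 950)
idx = balanced_patch_indices(labels)
print(len(idx), int(labels[idx].sum()))
```

Undersampling discards normal patches, which also reduces the number of patches to train on; class weights in the loss are an alternative when all patches should be kept.
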
- Preprocessed data, SavedModels, Checkpoints: https://drive.google.com/open?id=1gWOCgU_nzW4c0HIVRuekGYUJsLDzhZOe
- Slides: https://drive.google.com/open?id=1GxeSs6jFx3lYgDsfweh2fXYW1aRu8p92
- Preprocessed Data:
- 1)_Final_Initial_Data_Analysis_and_Preprocessing.ipynb: source code for downloading the data, preprocessing it, extracting patches and labels, and saving the preprocessed data to Google Drive
- 2)_Model_Pipeline_and_Evaluation.ipynb: source code for loading the precomputed data, creating the model, performing predictions, and drawing heatmaps
- Experiments: experiments with different models and levels
- level4_inception_2)_Model_Pipeline_and_Evaluation.ipynb
- level4_vgg_2)_Model_Pipeline_and_Evaluation.ipynb
- level5_vgg_2)_Model_Pipeline_and_Evaluation.ipynb
Since the code runs in Colaboratory, make sure you have access to the files.
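
The prediction, heatmap, and evaluation steps handled in the second notebook can be illustrated with a small sketch. This is not the notebook's code: the helper names `build_heatmap` and `pixel_metrics`, the patch-grid layout, the 0.5 threshold, and the toy probabilities are assumptions chosen only to show how per-patch predictions might be assembled into a heatmap and compared against a tumor mask.

```python
import numpy as np


def build_heatmap(probs, coords, grid_shape):
    """Place per-patch tumor probabilities onto a 2D patch grid.

    probs: one probability in [0, 1] per patch (hypothetical model output)
    coords: (row, col) position of each patch in the grid
    grid_shape: (n_rows, n_cols) of the patch grid
    """
    heatmap = np.zeros(grid_shape, dtype=np.float32)
    for p, (r, c) in zip(probs, coords):
        heatmap[r, c] = p
    return heatmap


def pixel_metrics(heatmap, mask, threshold=0.5):
    """Return (precision, recall) of the thresholded heatmap against a binary mask."""
    pred = heatmap >= threshold
    truth = np.asarray(mask).astype(bool)
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy 2x2 grid with made-up probabilities and a mask with one tumor cell.
coords = [(0, 0), (0, 1), (1, 0), (1, 1)]
probs = [0.9, 0.2, 0.7, 0.1]
mask = np.array([[1, 0], [0, 0]])

heatmap = build_heatmap(probs, coords, grid_shape=(2, 2))
precision, recall = pixel_metrics(heatmap, mask)
print(precision, recall)
```

Because tumor regions are rare, precision and recall tend to be more informative than plain accuracy for this kind of evaluation.
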
- Deka Auliya Akbar
- Yuting Wang
https://youtu.be/jJYXM844D5o If you have any questions, feel free to comment.