The medical AI industry is deeply rooted in invention. Data, technologies, and artificial intelligence (AI) algorithms combine to create innovative products that diagnose, treat, and prevent cancer and other diseases. However, the rate at which AI can advance healthcare depends on how rapidly industry and healthcare leaders embrace it. This change is most apparent in medical imaging, where deep learning now dominates research.
It is my pleasure to present this Deep Learning in Medical AI series, which is designed to highlight and summarize key concepts in the use of deep learning in medicine. The series focuses specifically on radiology, oncology, and pathology.
Deep learning research in oncology is a challenging and ever-changing field. Within it, we all value access to publicly available, annotated data and research, which help to educate and inspire further advances in our roles as data scientists.
I hope you find this series of benefit to you in your practice. If you would like to share your thoughts with us, we would welcome your comments. Please send any correspondence to shlomo@deeponcology.ai.
Finally, we are also very grateful to Google for their administrative and logistical support (e.g. GPUs) in the realization of this activity.
- https://github.com/deeponcology/applied-dl-2018
- The curriculum of the first 2018 course: http://deep-ml.com/assets/5daydeep/#/
Shlomo Kashani, Head of AI at DeepOncology AI, Kaggle Expert, Founder of Tel-Aviv Deep Learning Bootcamp: shlomo@deeponcology.ai
Knowledge of Python programming. Basics of linear algebra and statistics.
Google Colab, Google Cloud, Python Jupyter
PyTorch is an open source deep learning framework that’s quickly become popular with AI researchers for its ease of use, clean Pythonic API, and flexibility. With the preview release of PyTorch 1.0, developers can now seamlessly move from exploration to production deployment using a single, unified framework.
- Google Colab setup
- Running a CUDA program in C from Python
- PyTorch Tensors on the GPU
- Basics of PyTorch Data Loaders
- Standard PyTorch Augmentations (Transforms)
- Writing custom PyTorch Augmentations (RandomErasing)
- SOTA Augmentation libraries (Albumentations)
- Histopathology Images
- Lab 01 (old version) - Melanoma Classification: https://github.com/bayesianio/applied-dl-2018/blob/master/lab-0-SeNet-SeedLings.ipynb and https://bayesian-ai.trydiscourse.com/t/12-applied-deep-learning-labs-1-melanoma-detection/20
- Lab 02 (old version) - Breast Cancer Classification: https://github.com/bayesianio/applied-dl-2018/blob/master/lab-2-Breast-Cancer-Histopathology-SeNet.ipynb and https://bayesian-ai.trydiscourse.com/t/12-applied-deep-learning-labs-2-breast-cancer-classification/21
Deep learning, a sub-domain of machine learning, has recently shown impressive results across a range of domains. Biology and medicine are data rich, but the data are complex and often poorly understood. Problems of this nature may be especially well suited to deep learning methods.
This is a provisional curriculum, which is subject to change without notice.
- Ubuntu Linux 16.04, Mac OSX or Windows 10
- Python 3.5 or above
- CUDA 9.2 drivers
- cuDNN 7.0
- PyTorch and torchvision wheels are available on http://pytorch.org
- pytorch >= 0.4.0
- torchvision
- Pillow
- scipy
- tqdm
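The requirements above can be checked before starting the labs. The following is a minimal sketch and not part of the original repository: the helper names `missing_packages` and `python_ok` are hypothetical, and the only outside knowledge it relies on is the mapping from package names to import names (`pytorch` is imported as `torch`, `Pillow` as `PIL`). It does not verify version pins such as `pytorch >= 0.4.0`, nor the CUDA driver or cuDNN versions; it only reports whether PyTorch can see a CUDA device.

```python
import importlib.util
import sys

# Package name as listed in the requirements -> name used in `import ...`.
REQUIRED = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "Pillow": "PIL",
    "scipy": "scipy",
    "tqdm": "tqdm",
}


def python_ok(minimum=(3, 5)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def missing_packages(required=REQUIRED):
    """Return the listed package names whose import name cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    print("Python 3.5 or above:", python_ok())
    print("Missing packages:", missing_packages() or "none")
    # Only touch torch if it is installed, so the script never crashes here.
    if importlib.util.find_spec("torch") is not None:
        import torch

        print("PyTorch version:", torch.__version__)
        print("CUDA available:", torch.cuda.is_available())
```

Running the script on a fresh machine lists what still needs to be installed; an empty "Missing packages" line only means the imports resolve, not that the versions match.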
Keep in mind that this repository expects data to be in the same format as ImageNet. I encourage you to use your own datasets. In that case, organize your data so that the dataset folder contains EXACTLY two folders, named 'train' and 'val'.
The 'train' folder contains the training set, and the 'val' folder contains the validation set on which accuracy / log loss is measured.
The structure within the 'train' and 'val' folders is the same: each contains one folder per class, and all the images of a class sit inside the folder named after that class. This is crucial in PyTorch, because loaders such as torchvision's ImageFolder derive the class labels from these folder names.
For example, suppose your dataset has 2 classes, as in the Kaggle Statoil set, where you classify pictures of 1) ships and 2) icebergs, and you name your dataset folder 'data_dir'. Then 'data_dir' contains 'train' and 'val', and each of these contains 2 folders: 'ships' and 'ice'.
|- data_dir
    |- train
        |- ships
            |- ship_image_1
            |- ship_image_2
            .....
        |- ice
            |- ice_image_1
            |- ice_image_2
            .....
    |- val
        |- ships
        |- ice
For a full example refer to: https://github.com/QuantScientist/Deep-Learning-Boot-Camp/blob/master/Kaggle-PyTorch/PyTorch-Ensembler/kdataset/seedings.py
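The folder layout described above can be validated before training. This is a minimal sketch under stated assumptions, not the repository's own loader: `check_layout` and `make_datasets` are hypothetical helper names, the image size of 224 is an arbitrary choice, and `make_datasets` assumes the data is read with torchvision's `ImageFolder`, which assigns label indices to class folders in sorted order. torchvision is imported lazily, so the layout check itself needs only the standard library.

```python
import tempfile
from pathlib import Path

SPLITS = ("train", "val")


def check_layout(data_dir):
    """Return {split: sorted class folder names}; raise ValueError if the layout is off."""
    root = Path(data_dir)
    classes = {}
    for split in SPLITS:
        split_dir = root / split
        if not split_dir.is_dir():
            raise ValueError("missing folder: {}".format(split_dir))
        classes[split] = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
        if not classes[split]:
            raise ValueError("no class folders in {}".format(split_dir))
    if classes["train"] != classes["val"]:
        raise ValueError("class folders differ between splits: {}".format(classes))
    return classes


def make_datasets(data_dir, image_size=224):
    """Build one ImageFolder dataset per split (assumes torchvision is installed)."""
    from torchvision import datasets, transforms  # lazy import, see lead-in

    transform = transforms.Compose([
        transforms.Resize((image_size, image_size)),
        transforms.ToTensor(),
    ])
    root = Path(data_dir)
    return {
        split: datasets.ImageFolder(str(root / split), transform=transform)
        for split in SPLITS
    }


if __name__ == "__main__":
    # Build an empty copy of the example layout in a temporary folder and check it.
    with tempfile.TemporaryDirectory() as tmp:
        for split in SPLITS:
            for class_name in ("ships", "ice"):
                (Path(tmp) / split / class_name).mkdir(parents=True)
        print(check_layout(tmp))
```

Calling `check_layout('data_dir')` first gives a readable error for a missing or misnamed folder, rather than a failure deep inside the data loader. `make_datasets` will still fail on class folders that contain no images, since `ImageFolder` needs at least one valid file per dataset.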