Input functions for training neural networks on whole slide image (WSI) histology data, using tf.data input pipelines to gather random WSI patches at runtime.
This code was created by Brendon Lutnick.
An arXiv paper describing this method is available.
Abstract:
We have created a custom input pipeline to efficiently extract random patches and labels from whole slide images (WSIs) for input to a neural network. We use a TensorFlow backend to prefetch these patches during network training, avoiding the need for WSI preparation, such as WSI chopping, prior to training. This code is set up to randomly prefetch and augment patches from WSIs efficiently on the CPU at training time.
The patch extraction function should be wrapped in a tf.py_function so that it is evaluated in real time during training.
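The wrapping described above can be sketched as follows. This is a minimal illustration, not the code in "dataset_util.py": `fake_random_patch`, `load_patch`, `PATCH_SIZE`, and the slide file names are hypothetical, and the patch reader is replaced by random noise so the sketch runs without OpenSlide or real slides.

```python
import numpy as np
import tensorflow as tf

PATCH_SIZE = 64  # hypothetical patch edge length in pixels


def fake_random_patch(slide_path):
    """Hypothetical stand-in for a patch reader such as get_random_wsi_patch()."""
    # slide_path arrives as a scalar string tensor. A real reader would decode
    # it and open the slide; here we only return noise and a dummy label.
    patch = np.random.randint(
        0, 256, size=(PATCH_SIZE, PATCH_SIZE, 3), dtype=np.uint8)
    label = np.int32(0)
    return patch, label


def load_patch(slide_path):
    # tf.py_function lets arbitrary Python code run inside the tf.data graph.
    patch, label = tf.py_function(
        func=fake_random_patch, inp=[slide_path], Tout=[tf.uint8, tf.int32])
    # py_function drops static shapes, so restore them for downstream layers.
    patch.set_shape([PATCH_SIZE, PATCH_SIZE, 3])
    label.set_shape([])
    return patch, label


slide_paths = ["slide_a.svs", "slide_b.svs"]  # hypothetical file names
dataset = (
    tf.data.Dataset.from_tensor_slices(slide_paths)
    .repeat()                               # draw new random patches indefinitely
    .map(load_patch, num_parallel_calls=2)  # extract patches in parallel
    .batch(4)
    .prefetch(1)                            # prepare the next batch during training
)
```

With eager execution (TensorFlow 2.x), `next(iter(dataset))` yields one batch. Under TensorFlow 1.x graph mode, an iterator from `tf.compat.v1.data.make_one_shot_iterator(dataset)` would be needed instead.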
"dataset_util.py" contains the main functions:
- save_wsi_thumbnail_mask() - run automatically when needed to create masks of the tissue regions
- get_random_wsi_patch() - extracts random regions from WSIs; wrapped in a tf.py_function() for use in the network
- get_slide_label() - an example of how to read slide labels from a master Excel sheet
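One way a tissue mask can guide random patch selection is sketched below. This is an assumption about the general approach, not the implementation of get_random_wsi_patch(): the function `sample_patch_origin` and its parameters are hypothetical, and the sketch assumes the slide dimensions equal the mask dimensions multiplied by an integer downsample factor, with the patch no larger than the slide.

```python
import numpy as np


def sample_patch_origin(tissue_mask, downsample, patch_size, rng):
    """Pick a random patch top-left corner (x, y) in full-resolution pixels.

    tissue_mask: 2-D boolean array at thumbnail resolution (True = tissue).
    downsample:  slide pixels per mask pixel (assumed to be an integer).
    patch_size:  patch edge length in slide pixels.
    rng:         a numpy.random.Generator.
    """
    ys, xs = np.nonzero(tissue_mask)
    if ys.size == 0:
        raise ValueError("tissue mask contains no tissue pixels")
    i = rng.integers(ys.size)  # choose one tissue pixel uniformly at random

    # Center the patch on the chosen mask pixel, in slide coordinates.
    x = int(xs[i]) * downsample + downsample // 2 - patch_size // 2
    y = int(ys[i]) * downsample + downsample // 2 - patch_size // 2

    # Clamp so the patch stays inside the slide.
    height = tissue_mask.shape[0] * downsample
    width = tissue_mask.shape[1] * downsample
    x = min(max(x, 0), width - patch_size)
    y = min(max(y, 0), height - patch_size)
    return x, y


mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:7] = True  # toy tissue region
origin = sample_patch_origin(mask, downsample=16, patch_size=32,
                             rng=np.random.default_rng(0))
```

The returned coordinates could then be passed to OpenSlide, for example `slide.read_region((x, y), 0, (patch_size, patch_size))`, to read the patch at full resolution.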
See "example_usage.py" for more information on using these functions.
Examples of this code in use are available:
This code was developed on Ubuntu Linux running tensorflow-gpu 1.15. It requires the following:
- Python
- OpenSlide
- OpenSlide-tools
- TensorFlow >= 1.14
- OpenCV
- Pandas
- scikit-image (skimage)
- Matplotlib
- Pillow
- Scipy
- Numpy