
Common Coordinate Framework (CCF) Research on Functional Tissue Units (FTU)

FTU Segmentation through Machine Learning (ML) Algorithms
Explore the docs »

Table of Contents

  • About The Project
  • Data
  • Algorithms
  • Documentation
  • ML Pipelines
  • Contributing
  • License
  • Contact
  • Acknowledgements

About The Project

The Human BioMolecular Atlas Program (HuBMAP) aims to create an open, global atlas of the human body at the cellular level. One component of this overarching goal is to identify glomeruli, functional tissue units (FTUs) consisting of capillaries that facilitate filtration of blood, within whole slide images of kidney tissue. Once these glomeruli are detected in the microscopy images, information on size and location within the kidney samples can be used to build a spatially accurate model of human kidneys for HuBMAP.

Manual identification and classification of FTUs from microscopy images requires highly trained experts and is labor intensive. Many ML algorithms have previously been applied to automate detection of glomeruli. For this work, the Faster R-CNN and Mask R-CNN ML algorithms were utilized to detect glomeruli in Periodic acid-Schiff (PAS) stained whole slide images of kidney tissue samples. Faster R-CNN and Mask R-CNN are both convolutional neural network (CNN) algorithms designed for object detection in images. The Faster R-CNN algorithm outputs bounding boxes around identified glomeruli, which are described by x-min, x-max, y-min, and y-max measures within the context of the original slide image. The Mask R-CNN algorithm takes this a step further, outputting a pixel-wise map of glomeruli vs. non-glomeruli sections of arbitrary shape from the input image. Future work will include application of more ML algorithms, such as AlexNet, to FTUs from other tissues, such as colonic crypts.
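
For illustration only (not code from this repository), the snippet below contrasts the two output forms: a Faster R-CNN-style bounding box given by x-min/x-max/y-min/y-max coordinates, and a Mask R-CNN-style pixel-wise binary mask. The coordinates and image size are made up.

```python
import numpy as np

# Faster R-CNN-style output: one bounding box per detected glomerulus,
# described by its x-min/x-max/y-min/y-max coordinates in the slide image.
box = {"xmin": 120, "xmax": 180, "ymin": 40, "ymax": 95}

# Mask R-CNN-style output: a glomeruli / non-glomeruli label for every pixel,
# so detections can take arbitrary shapes (here it is simply the box region).
mask = np.zeros((256, 256), dtype=bool)
mask[box["ymin"]:box["ymax"], box["xmin"]:box["xmax"]] = True

print("box area (pixels):", (box["xmax"] - box["xmin"]) * (box["ymax"] - box["ymin"]))
print("mask pixels:", int(mask.sum()))
```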

Data

Our raw microscopy image data is provided by Tissue Mapping Centers (TMCs) affiliated with HuBMAP. The first Data Portal release has made this data open access and free for anyone's use.

Current work has focused on segmentation of glomeruli in PAS stained kidney whole slide images. Manual annotations of glomeruli within these images were produced as training material for the algorithms. The output segmentation results vary in form and include bounding boxes and binary, pixel-wise masks.

Future work will also incorporate alternative imaging methods and tissue types.

Kidney

Colon

The colon data has yet to be segmented by our ML algorithms, but it is the focus of future development.

Algorithms

Faster R-CNN

The Faster R-CNN algorithm is one type of CNN used for object detection in images. CNNs employ deep neural networks and learn image features automatically from the data rather than relying on hand-engineered features. The algorithm takes an image as input and divides it into smaller rectangular regions, each of which is then treated as a separate image. These regions are passed to the CNN, which provides classes and bounding boxes for detected objects. In the case of kidney segmentation, the classes are "Glomeruli" and "Non-Glomeruli". Once all regions have been processed, they are combined to reconstruct the original image with glomeruli detected in rectangular boxes. The algorithm outputs these detection boxes as a .csv file in which each predicted annotation is a single row with the fields "filename", "xmin", "xmax", "ymin", and "ymax". The "filename" field is the unique number given to the region of the original image in which the annotation was detected.

Faster R-CNN Diagram
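
As a rough illustration (not code from this repository), the sketch below shows how a bounding-box .csv in the format described above could be loaded with pandas and drawn onto its region image with Pillow; the file paths are hypothetical placeholders.

```python
import pandas as pd
from PIL import Image, ImageDraw

# Predictions CSV in the format described above (hypothetical file name);
# each row is one predicted glomerulus bounding box.
boxes = pd.read_csv("faster_rcnn_predictions.csv")  # filename, xmin, xmax, ymin, ymax

# Draw every box predicted for one region of the original slide image
# (hypothetical directory layout).
region_id = boxes["filename"].iloc[0]
image = Image.open(f"regions/{region_id}.png").convert("RGB")
draw = ImageDraw.Draw(image)

for _, row in boxes[boxes["filename"] == region_id].iterrows():
    draw.rectangle([row["xmin"], row["ymin"], row["xmax"], row["ymax"]],
                   outline="red", width=3)

image.save(f"regions/{region_id}_detected.png")
```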

Mask R-CNN

The Mask R-CNN algorithm is built upon the Faster R-CNN algorithm, but adds an instance segmentation extension that predicts a segmentation mask for each annotation. Rather than relying on the rectangular regions of Faster R-CNN for its detection boxes, Mask R-CNN assigns a "Glomeruli" or "Non-Glomeruli" classification to each pixel in the original image. This allows the resulting annotations to take any shape describable by pixels and enables the creation of binary mask overlays for the original image.

Mask R-CNN Diagram
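
As a rough illustration (again, not code from this repository), the sketch below overlays a binary glomeruli mask onto its original region image with NumPy and Pillow; the file names are hypothetical placeholders, and the mask is assumed to have the same dimensions as the image.

```python
import numpy as np
from PIL import Image

# Original region image and its predicted binary mask (hypothetical file
# names); the mask is assumed to be the same height and width as the image,
# with nonzero pixels where glomeruli were predicted.
image = np.array(Image.open("region.png").convert("RGB"))
mask = np.array(Image.open("glomeruli_mask.png").convert("L")) > 0

# Tint the predicted glomeruli pixels red while keeping the rest of the image.
overlay = image.copy()
overlay[mask] = (0.5 * overlay[mask] + 0.5 * np.array([255, 0, 0])).astype(np.uint8)

Image.fromarray(overlay).save("region_overlay.png")
```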

AlexNet

The AlexNet architecture consists of eight layers: five convolutional layers and three fully-connected layers. Rectified Linear Unit (ReLU) nonlinearity is applied after every convolutional and fully-connected layer. In the first and second convolutional layers, the ReLU output is followed by a local response normalization step before pooling. To classify and detect glomeruli in whole slide images (WSIs), a pre-trained AlexNet model was used to distinguish glomeruli through pixel-wise classification and segmentation, constructing a binary mask of the glomeruli in each WSI. AlexNet requires an input of 227x227 pixels, so an augmented training dataset at that resolution was used with this method. Once the classification model was trained, 227-pixel-high horizontal strips of the WSI were fed to the model for prediction. To predict glomeruli regions within their respective tiles, pixel-wise analysis was conducted at each 227x227 pixel fraction of these horizontal strips using a sliding window. Once segmentation was complete, all the tiles were stitched together to form a binary glomeruli mask of the WSI.

AlexNet Diagram
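
The sketch below is an assumed, simplified version of this tiling-and-stitching workflow using torchvision's AlexNet: it scores non-overlapping 227x227 tiles rather than the per-pixel sliding window described above, the replaced two-class classifier head is untrained here (in practice it would be fine-tuned on annotated tiles), and the file names are placeholders.

```python
import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms

TILE = 227  # AlexNet input size

# Pre-trained AlexNet with its last layer replaced for two classes
# (glomeruli vs. non-glomeruli); this head would be fine-tuned before use.
model = models.alexnet(weights="IMAGENET1K_V1")
model.classifier[6] = torch.nn.Linear(4096, 2)
model.eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

wsi = np.array(Image.open("kidney_wsi.png").convert("RGB"))  # hypothetical WSI strip
height, width = wsi.shape[:2]
mask = np.zeros((height, width), dtype=np.uint8)

# Slide a non-overlapping 227x227 window over each horizontal strip and mark
# tiles classified as glomeruli; the stitched result is the binary mask.
with torch.no_grad():
    for y in range(0, height - TILE + 1, TILE):
        for x in range(0, width - TILE + 1, TILE):
            tile = preprocess(wsi[y:y + TILE, x:x + TILE]).unsqueeze(0)
            if model(tile).argmax(dim=1).item() == 1:  # class 1 = glomeruli (assumed)
                mask[y:y + TILE, x:x + TILE] = 255

Image.fromarray(mask).save("glomeruli_binary_mask.png")
```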

Documentation

Refer to each algorithm's documentation for how it was implemented.

ML Pipelines

Faster R-CNN Pipeline

Mask R-CNN Pipeline

(Insert AlexNet Pipeline image here.)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Yingnan Ju - yiju@iu.edu

Leah Scherschel - @LeahScherschel - llschers@iu.edu

Project Link: https://github.com/cns-iu/ccf-research-ftu

Acknowledgements

  • HuBMAP
  • TMC-Vanderbilt
  • TMC-Stanford