TGMM in KNIME

KNIME workflow that automates the use of the software Tracking by Gaussian Mixture Models (TGMM, Amat et al.) for 3D cell segmentation and cell-lineage tracking in 3D+time image sequences.
TGMM was designed to work with datasets of cells with fluorescently labelled nuclei (e.g. H2B-mCherry in developing Drosophila embryos).

Results can be viewed either in KNIME or, with better rendering, using the MaMut plugin in FIJI.

The IDRdownloader python script in this repository allows downloading additional datasets from the Image Data Resource database (https://idr.openmicroscopy.org/).
It is currently set to download an appropriate dataset from the Keller lab (source of the TGMM software) (https://idr.openmicroscopy.org/webclient/img_detail/4007801/?dataset=3351).

Requirements

TGMM requires a GPU with CUDA support.
This KNIME workflow is only compatible with TGMM for Windows and was tested with KNIME 4.4.0.
The KNIME extensions needed by the workflow (Image Processing and External Tools) will be installed automatically by KNIME upon opening the workflow.

It is recommended to set the logging level in KNIME to DEBUG, so that the output of the TGMM executables called by the workflow is shown directly in the KNIME console.
To do so, go to Preferences > KNIME > KNIME GUI and select DEBUG.

The original TGMM repository contains the necessary software to run this workflow (TGMM 1.0) as well as some documentation/user-guide and an example dataset under data>data.
You can find it at: https://sourceforge.net/projects/tgmm/files/.
TGMM 1.0 is also available at https://git.rcc.uchicago.edu/open-source/TGMM but this version was returning CUDA errors with our configuration.
TGMM 2.0 is also available but not compiled. There is also some documentation in the doc directory of this repo: https://bitbucket.org/fernandoamat/tgmm-paper/src/master/.

Citations

If you use this workflow for your research, please cite the original publication

Amat, F., Lemon, W., Mossing, D. P., McDole, K., Wan, Y., Branson, K., Myers, E. W. and Keller, P. J. (2014).
Fast, accurate reconstruction of cell lineages from large-scale fluorescence microscopy data.
Nature Methods 11, 951–958,
doi:10.1038/nmeth.3036

as well as the DOI of this repository from Zenodo (see at the top of this page).
If you have a github account, you can also "star" the repo (icon on the top right) ;)

Expected datasets

We recommend using 3D datasets (2D is also supposed to work for some values of connectivity) and processing at least 10 timepoints.
Errors were observed when using fewer timepoints.

The Z-stack of each timepoint has to be saved in a separate .tif file. Use the single-"f" extension: it is not confirmed whether the double-"f" .tiff extension works, since TGMM matches image filenames against a wildcard pattern and appends the tif extension.

Additionally, the filenames should contain the timepoint and should be identical between images, except for the timepoint. The reason is that the image location is provided to the workflow (and to TGMM) as a wildcard pattern for the filepath, with ??? in place of the timepoint and no extension.
Example to match the following images:

myPath/frame0001.tif
myPath/frame0002.tif

One should use the pattern myPath/frame???? with 4 ? since all frames have a timepoint encoded as a 4-digit value.
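
The pattern can also be derived from the filenames themselves. The following is a minimal Python sketch, not part of the workflow: the helper names `tgmm_pattern` and `common_pattern` are hypothetical, and it assumes that the timepoint is the last run of digits in the file name.

```python
import os
import re


def tgmm_pattern(filename):
    """Return the wildcard pattern for one image file.

    Assumption: the timepoint is the last run of digits in the file name.
    """
    base, _extension = os.path.splitext(filename)
    # Split off the directory part (accept both / and \ separators).
    cut = max(base.rfind("/"), base.rfind("\\")) + 1
    directory, name = base[:cut], base[cut:]
    digit_runs = list(re.finditer(r"\d+", name))
    if not digit_runs:
        raise ValueError(f"no timepoint found in {filename!r}")
    last = digit_runs[-1]
    # One "?" per digit of the timepoint, extension dropped.
    masked = name[:last.start()] + "?" * len(last.group()) + name[last.end():]
    return directory + masked


def common_pattern(filenames):
    """Return the single pattern shared by all files, or raise if they differ."""
    patterns = {tgmm_pattern(name) for name in filenames}
    if len(patterns) != 1:
        raise ValueError(f"file names do not share one pattern: {sorted(patterns)}")
    return patterns.pop()


print(common_pattern(["myPath/frame0001.tif", "myPath/frame0002.tif"]))
# prints myPath/frame????
```

If the timepoints are not zero-padded to the same number of digits, the files yield different patterns and `common_pattern` raises an error, which is a quick way to spot a naming problem before starting the workflow.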

Processing steps

This workflow includes the following functions:

Hierarchical segmentation of 3D images into supervoxels

This is performed by the node "Pyramidal Segmentation Hierarchy".

This first step includes:

reducing noise in images using a median filter (uses CUDA)
identification of foreground regions using the "background threshold" parameter
watershed
Persistence Based Clustering (PBC), creating the hierarchy of segmentation levels

The node is internally calling ProcessStackBatchMultiCore.exe <TGMMconfig.txt> <firstTimepoint> <lastTimepoint>.
The config file is automatically generated from the parameters provided in the GUI of the node (refer to the mouse-over description of the GUI elements and/or to the TGMM user-guide for a more detailed description of the parameters).
Internally, this command runs ProcessStack.exe <TGMMconfig.txt> <image> once per timepoint, where <image> is the Z-stack of that timepoint, and distributes these calls over as many CPU cores as available (i.e. timepoints are processed in parallel).
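
For reference, the same batch call could be issued outside KNIME. The sketch below only assembles the command line given above and is not taken from the workflow: the function name `batch_segmentation_command` is hypothetical, and it assumes that the executable is on the PATH (otherwise pass its full path).

```python
import subprocess


def batch_segmentation_command(config_path, first_timepoint, last_timepoint,
                               executable="ProcessStackBatchMultiCore.exe"):
    """Build the argument list for the batch segmentation call.

    Assumption: the executable is on the PATH; otherwise give its full path.
    """
    if last_timepoint < first_timepoint:
        raise ValueError("last_timepoint must be >= first_timepoint")
    return [executable, str(config_path), str(first_timepoint), str(last_timepoint)]


def run(command):
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(command, check=True)


command = batch_segmentation_command("TGMMconfig.txt", 0, 9)
print(" ".join(command))
# run(command)  # uncomment on a Windows machine where TGMM is installed
```

Passing the arguments as a list rather than a single string avoids quoting problems when the config path contains spaces.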

The output is one hierarchical segmentation per timepoint, saved as .bin files in the image directory.
These .bin files can be used to derive different segmentations by choosing a cut-off (Tau): each Tau value corresponds to one segmentation level.
Selecting a Tau "cuts" the hierarchical segmentation at a given level, yielding a first set of supervoxels (see the next step of the workflow).

Note: ProcessStack can also generate such a .bin file for the Z-stack of a single timepoint, by calling
ProcessStack.exe <image> <radiusMedianFilter> <minTau> <backgroundThr> <conn3D>

Visualization of the segmentation for a given Tau (optional)

This step calls ProcessStack.exe <binFile> <Tau> <minSuperVoxelSize>.
It lets you visualize the resulting segmentation for a given Tau and minimum nuclei size (also called supervoxel size).
It is essentially a way to optimize Tau before running the tracking, which also requires a Tau value.
It takes the .bin files generated at the previous step and outputs segmentation masks as .tif files in the image directory.
The masks are then loaded in KNIME and can be viewed overlaid on the original images.
From the original publication: "The higher the value of Tau, the coarser the segmentation, as more image regions are merged."
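
The effect of Tau can be illustrated with a toy example. The sketch below is not TGMM's implementation and does not read the .bin format: it assumes a hierarchy given as a list of (region_a, region_b, merge_tau) tuples, where two regions are merged as soon as the chosen Tau reaches their merge value. The function name `cut_hierarchy` and this data layout are hypothetical.

```python
def cut_hierarchy(n_regions, merges, tau):
    """Return one label per region after applying all merges with merge_tau <= tau.

    Toy model (assumption): `merges` is a list of (region_a, region_b, merge_tau).
    """
    parent = list(range(n_regions))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b, merge_tau in merges:
        if merge_tau <= tau:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a
    return [find(i) for i in range(n_regions)]


merges = [(0, 1, 2.0), (2, 3, 5.0), (1, 2, 10.0)]
for tau in (1.0, 5.0, 10.0):
    labels = cut_hierarchy(4, merges, tau)
    print(tau, len(set(labels)), labels)
```

With these toy values, the number of distinct regions drops from 4 to 2 to 1 as Tau increases, which matches the quoted behaviour: a higher Tau gives a coarser segmentation.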

Tracking of cells

This is calling TGMM.exe <TGMMconfig.txt> <firstTime> <lastTime>.
It performs the following steps:

derive a segmentation corresponding to the selected Tau from the .bin files containing the hierarchical segmentation (this is what the optional step above does)
remove supervoxels below minNucleiSize
apply Otsu thresholding followed by filtering with maxNucleiSize
fit Gaussians on the supervoxels/nuclei (the Gaussian Mixture Model)
establish cell tracks and lineages based on the Gaussian fit
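
To give an intuition for the size filter and the Gaussian fit listed above, here is a simplified Python sketch. It is not how TGMM fits its mixture model: it only estimates one Gaussian per supervoxel from intensity-weighted moments (mean and covariance). The function names and the representation of a supervoxel as a list of (x, y, z, intensity) tuples are assumptions made for this example.

```python
def remove_small_supervoxels(supervoxels, min_nuclei_size):
    """Keep supervoxels with at least min_nuclei_size voxels."""
    return [sv for sv in supervoxels if len(sv) >= min_nuclei_size]


def fit_gaussian(voxels):
    """Intensity-weighted mean and 3x3 covariance of one supervoxel.

    `voxels` is a list of (x, y, z, intensity) tuples (assumed layout).
    """
    total = sum(v[3] for v in voxels)
    if total <= 0:
        raise ValueError("supervoxel has no positive intensity")
    mean = [sum(v[d] * v[3] for v in voxels) / total for d in range(3)]
    covariance = [
        [
            sum(v[3] * (v[i] - mean[i]) * (v[j] - mean[j]) for v in voxels) / total
            for j in range(3)
        ]
        for i in range(3)
    ]
    return mean, covariance


supervoxels = [
    [(0, 0, 0, 1.0), (2, 0, 0, 1.0), (1, 1, 0, 2.0)],  # 3 voxels, kept
    [(9, 9, 9, 5.0)],                                   # 1 voxel, removed
]
for supervoxel in remove_small_supervoxels(supervoxels, min_nuclei_size=2):
    mean, covariance = fit_gaussian(supervoxel)
    print(mean, covariance)
```

The mean plays the role of the nucleus centroid and the covariance describes its extent and orientation, which is the kind of information stored per nucleus in the tracking output.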

The output is a set of XML files (one per timepoint).
Each XML file contains the coordinates of the Gaussians fitted on the nuclei, as well as track/lineage information.
The XML files can be loaded in Fiji using MaMut to view the nuclei rendered from the Gaussian fits and their tracks.

Display of localized nuclei

This step takes the .xml files and allows visualization of the centroids overlaid on the original images.
The color of the centroid labeling is determined by the index of the cell lineage (ie cells from the same lineage have the same labeling color).
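
One simple way to obtain such a colouring is to map the lineage index to a hue. The sketch below is an illustration only and does not reproduce the colour table used by the workflow; the function name `lineage_color` and the golden-ratio hue step are choices made for this example.

```python
import colorsys

GOLDEN_RATIO_CONJUGATE = 0.618033988749895


def lineage_color(lineage_index):
    """Map a lineage index to an (R, G, B) tuple with values in 0-255.

    The same index always yields the same colour; stepping the hue by the
    golden ratio conjugate spreads consecutive indices around the colour wheel.
    """
    hue = (lineage_index * GOLDEN_RATIO_CONJUGATE) % 1.0
    red, green, blue = colorsys.hsv_to_rgb(hue, 0.85, 0.95)
    return tuple(round(channel * 255) for channel in (red, green, blue))


for index in range(3):
    print(index, lineage_color(index))
```

Because the colour depends only on the lineage index, a cell and all its descendants keep the same colour across timepoints.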

MaMut import

To import results in MaMut, all images need to be integrated into a hyperstack with the correct number of slices and frames in FIJI.

Then, the Big Data Viewer plugin can be used to export the stack to hdf5 + xml.

The Import TGMM results into MaMut submenu of MaMut takes the path to the GMEMtracking3D_\XML_finalResult_lht directory (created by TGMM in your results folder) and the XML created by Big Data Viewer.

It produces a new XML file, which is used by MaMut in the Open MaMut annotation submenu and directly provides access to the different viewers provided by MaMut.

LauLauThom/knimeTGMM_reloaded