GMOnotebook

Jupyter notebooks with templates for studying plant transformation rates by cross-referencing fluorescent hyperspectral and RGB computer vision modules

Hyperspectral and RGB images for our genome-wide association study (GWAS) of in vitro transformation and regeneration in Populus trichocarpa can be found here. These images were analyzed with GMOdetector release v0.62.

Description

The GMOdetector workflow provided here is a means of studying plant transformation using images collected by the macroPhor Array imaging platform (Middleton Spectral Vision). For each petri dish sample, two images are collected: one with a conventional RGB camera, and another by fluorescent hyperspectral imaging. The RGB images are analyzed by two convolutional neural networks: the first, a semantic segmentation model with the DeepLabv3+ architecture, segments pixels of regenerating tissues such as callus and shoot; the second, a DenseNet classification network, identifies explants that are heavily contaminated or missing so they can be excluded from further analysis. Hyperspectral images are analyzed with the CubeGLM package for hyperspectral image analysis, which provides measures of chlorophyll and reporter protein signal. Finally, the RGB and hyperspectral image layers are aligned and cross-referenced to provide measures of chlorophyll and reporter protein signal in specific regenerating tissues, such as callus and shoot.
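
As a toy illustration of this final cross-referencing step, the sketch below summarizes a per-pixel reporter signal map within segmented tissue classes. The arrays, label scheme, and values are invented for illustration and do not reflect the actual module interfaces.

```python
import numpy as np

# Invented inputs: a per-pixel reporter signal map from the hyperspectral
# regression and a tissue segmentation mask from the RGB model, assumed to
# already be aligned to the same frame.
CLASSES = {1: "callus", 2: "shoot"}  # hypothetical label scheme
gfp_signal = np.random.rand(480, 640)                   # reporter weight per pixel
tissue_mask = np.random.randint(0, 3, size=(480, 640))  # 0 = background

for label, name in CLASSES.items():
    in_tissue = tissue_mask == label
    print(f"{name}: area = {in_tissue.sum()} px, "
          f"mean GFP signal = {gfp_signal[in_tissue].mean():.3f}")
```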

Installation

GMOdetector is a highly modular workflow that depends on Anaconda to manage environments for its separate modules. Jupyter notebooks are used to describe and organize the use of this workflow. Installation of specific modules, such as those for RGB and hyperspectral image analysis and for cross-referencing their outputs, is described in the setup tutorial notebook. Installation on Ubuntu or another Linux operating system is recommended.

Overview of use

  1. Before running the workflow, metadata for each petri dish sample must be prepared, and the appropriate parameters for analysis must be provided. These processes are described in a series of tutorial notebooks with examples; a toy metadata table is also sketched after this list.
  2. The workflow itself can be deployed directly via a Jupyter notebook for a single dataset (defined as a set of images collected for a given experiment at a given timepoint), or via a higher-level pipeline that runs the workflow over many datasets.
  3. Many statistical outputs are provided, including regeneration and transformation frequencies across plates, and various measures of chlorophyll and reporter protein signal as well as tissue size for individual explants on plates. These statistics are presented directly as plots. Tests for significance of treatment effects on these statistics are also included and results are output as spreadsheets. Descriptions of the various outputs and how to interpret them can be found in these notebooks.
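
As a minimal illustration of step 1, the snippet below builds a small per-dish metadata table with pandas. The column names and values are hypothetical; the fields actually required are described in the tutorial notebooks.

```python
import pandas as pd

# Hypothetical per-dish metadata; the required columns are defined in the
# tutorial notebooks, not here.
metadata = pd.DataFrame({
    "dish_id":   ["D001", "D002"],
    "genotype":  ["BESC-101", "BESC-202"],
    "treatment": ["construct_A", "construct_A"],
    "timepoint": ["week_3", "week_3"],
    "rgb_image": ["D001_rgb.png", "D002_rgb.png"],
    "hsi_cube":  ["D001.hdr", "D002.hdr"],
})
metadata.to_csv("metadata.csv", index=False)
```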

Overview of modules under the hood

  1. Hyperspectral analysis relies on the CubeGLM (formerly gmodetector_py) Python package. This package works by computing weights for known spectral components specified by the user (e.g. chlorophyll and GFP) across all pixels of a given fluorescent hyperspectral image; a toy version of this per-pixel regression is sketched after this list.
  2. Semantic segmentation of RGB images according to tissue type (e.g. callus, shoot, unregenerated stem explant material, and contamination) is accomplished via a neural network model based on the DeepLabV3+ architecture; a generic inference sketch is also given after this list.
  3. Binary classification of individual explants within RGB images is performed using a DenseNet neural network model, to identify explants that are heavily contaminated or missing and should thus be excluded from further analysis. Our implementation of the DenseNet model and associated scripts can be found here.
  4. Alignment of RGB and hyperspectral layers: We have implemented two approaches to accomplish this alignment, which is necessary because the images are taken by different cameras with different resolutions and frames. These approaches are described in these notebooks, and the opencv-based method can be found in this repository; a minimal alignment example is sketched after this list.
  5. Once the RGB and hyperspectral layers are aligned, we cross-reference the layers and compute statistics based on reporter protein and chlorophyll signal in specific tissues such as callus and shoot. Finally, plots are produced and statistical tests are performed. These final tasks are accomplished with a series of Python and R scripts found in the GMOlabeler repository.
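
The sketch below is a toy version of the per-pixel spectral regression described in item 1: each pixel's emission spectrum is modeled as a linear combination of known component spectra, and the fitted weights become per-component signal maps. Ordinary least squares is used here as a stand-in, and all array shapes and values are invented; CubeGLM's actual interface and fitting details may differ.

```python
import numpy as np

# Toy per-pixel spectral regression (shapes and values are invented).
n_bands, n_components = 300, 2
X = np.abs(np.random.rand(n_bands, n_components))  # known component spectra as columns
cube = np.random.rand(100, 100, n_bands)           # hyperspectral cube (rows, cols, bands)

# Solve pixel_spectrum ≈ X @ w for every pixel at once with least squares.
pixels = cube.reshape(-1, n_bands).T               # (bands, pixels)
weights, *_ = np.linalg.lstsq(X, pixels, rcond=None)
weight_maps = weights.T.reshape(100, 100, n_components)  # e.g. chlorophyll and GFP maps
```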
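
For item 2, the snippet below shows generic semantic-segmentation inference, using torchvision's DeepLabV3 as a stand-in; the repository's actual model is a DeepLabV3+ network trained on plant tissue classes, and the class count and input here are placeholders.

```python
import torch
import torchvision

# Generic segmentation inference; the real model is a DeepLabV3+ network
# trained on tissue classes, and the class count here is assumed.
NUM_CLASSES = 5  # e.g. background, callus, shoot, stem, contamination (assumed)
model = torchvision.models.segmentation.deeplabv3_resnet50(
    weights=None, num_classes=NUM_CLASSES
)
model.eval()

rgb = torch.rand(1, 3, 512, 512)      # stand-in for a normalized RGB image tensor
with torch.no_grad():
    logits = model(rgb)["out"]        # (1, NUM_CLASSES, H, W)
tissue_mask = logits.argmax(dim=1)    # per-pixel tissue labels
```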
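
And for item 4, the snippet below illustrates one common opencv alignment pattern: estimate a homography from corresponding reference points in the two frames, then warp a hyperspectral signal map into the RGB frame. The points, image sizes, and data are invented, and the repository's actual method may differ.

```python
import cv2
import numpy as np

# Corresponding reference points (e.g. plate corners) located in each frame;
# coordinates are invented for illustration.
pts_hyper = np.array([[10, 12], [500, 15], [495, 480], [8, 475]], dtype=np.float32)
pts_rgb = np.array([[40, 50], [1900, 55], [1890, 1820], [35, 1815]], dtype=np.float32)

H, _ = cv2.findHomography(pts_hyper, pts_rgb)

# Warp a hyperspectral signal map (e.g. GFP weights) into the RGB frame.
signal_map = np.random.rand(512, 512).astype(np.float32)   # placeholder data
aligned = cv2.warpPerspective(signal_map, H, (1920, 1880))  # (width, height) of RGB frame
```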

Version

The current version is 0.62 (October 15, 2022).

Acknowledgements

We thank the National Science Foundation Plant Genome Research Program for support (IOS #1546900, Analysis of genes affecting plant regeneration and transformation in poplar), and members of the GREAT TREES Research Cooperative at OSU for their support of the Strauss laboratory.