/whole_brain_analysis

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

WHOLE BRAIN ANALYSIS PIPELINE

This repository contains the source code for the paper:

Whole-brain tissue mapping toolkit using large-scale highly multiplexed immunofluorescence imaging and deep neural networks


This pipeline is for processing multispectral fluorescence 2D image datasets. It will correct the multiplexed images for:

  • Pixel-to-pixel registration due to stage misalignemnt between rounds
  • Intra-channel non-specific signal correction of autofluorescence, non-uniform illumianation shading, Photo-pleaching and tissue folds
  • Inter-channel non-specific signal correction of spectral bleed-through and molecular co-localization

And generate quantitative readouts for each cell including:

  • Cell nuclei location
  • Cell type
  • Cell status

Setup:

The pipeline is supported for Windows and Linux with CUDA-enabled GPU and enough RAM depending on the dataset size.

Python Dependencies

  • numpy
  • scipy
  • pandas
  • cython
  • requests
  • progressbar
  • scikit-learn
  • scikit-image==0.16.1
  • tifffile
  • opencv-python
  • tensorflow-gpu==1.9.0
  • pycocotools
  • keras == 2.2.0
  • h5py

setup_env.py creates a conda environment (brain) and installs the required libraries.

python setup_env.py

Note: tensorflow-gpu library requires CUDA toolkit 9.0 and cuDNN 7.0.5. You can follow the instructions from here to install the TensorFlow and dependencies.

For detection module you need to install protoc. Download executable from here and run the following command from DETECTION/lib directory (read more):

# from DETECTION/lib
protoc object_detection/protos/*.proto --python_out=.

Setting up the environment and installing the requirements takes ~45 minutes.

1. Reconstruction

Parse the arguments to main_reconstruction.py:

python main_reconstruction.py \
   --INPUT_DIR /path/to/input/dir \
   --OUTPUT_DIR /path/to/output/dir \
   --MODE supervised \
   --SCRIPT /path/to/script.csv
python main_reconstruction.py \
   --INPUT_DIR /path/to/input/dir \
   --OUTPUT_DIR /path/to/output/dir \
   --MODE unsupervised \
   --DEFAULT_BOX 34000 8000 44000 15000 \
   --BRIGHTFIELD 11

Arguments:

  • INPUT_DIR : Path to the directory containing input images

  • OUTPUT_DIR : Path to the directory saving output images

  • MODE: supervised or unsupervised

  • if unsupervised:

    • DEFAULT_BOX : [xmin, ymin, xmax, ymax]

    Alt text

    • BRIGHTFIELD : Number of Brightfield channel or None

    Will generate a script in OUTPUT_DIR/script.csv like follows:

    filename biomarker intra channel correction inter channel correction channel 1 channel 2 channel 3 xmin ymin xmax ymax
    R1C1.tif Yes Yes 34000 8000 44000 15000
    R1C10.tif Yes Yes R1C7.tif R1C5.tif 34000 8000 44000 15000
    R1C11.tif Brightfield No No 34000 8000 44000 15000
    R1C2.tif Yes Yes R1C1.tif 34000 8000 44000 15000
    R1C3.tif Yes Yes 34000 8000 44000 15000
    R1C4.tif Yes Yes R1C1.tif 34000 8000 44000 15000
    R1C5.tif Yes Yes R1C6.tif 34000 8000 44000 15000
    R1C6.tif Yes Yes R1C5.tif 34000 8000 44000 15000
    R1C7.tif Yes Yes R1C10.tif R1C5.tif 34000 8000 44000 15000
    R1C8.tif Yes Yes R1C7.tif 34000 8000 44000 15000
    R1C9.tif Yes Yes 34000 8000 44000 15000
    R2C1.tif Yes Yes 34000 8000 44000 15000
    R2C10.tif Yes Yes R2C8.tif 34000 8000 44000 15000
    R2C11.tif Brightfield No No 34000 8000 44000 15000
    R2C2.tif Yes Yes 34000 8000 44000 15000
    R2C3.tif Yes Yes R2C8.tif R2C6.tif 34000 8000 44000 15000
    R2C4.tif Yes Yes 34000 8000 44000 15000
    R2C5.tif Yes Yes R2C3.tif R2C6.tif 34000 8000 44000 15000
    R2C6.tif Yes Yes R2C3.tif R2C8.tif 34000 8000 44000 15000
    R2C7.tif Yes Yes R2C6.tif R2C8.tif R2C5.tif 34000 8000 44000 15000
    R2C8.tif Yes Yes 34000 8000 44000 15000
    R2C9.tif Yes Yes 34000 8000 44000 15000
  • if supervised :

    • SCRIPT : Path to the updated script.csv file for reconstruction configuration:
    filename biomarker intra channel correction inter channel correction channel 1 channel 2 channel 3 xmin ymin xmax ymax
    R1C1.tif DAPI Yes No 34000 8000 44000 15000
    R1C10.tif NFH No Yes R1C7.tif 34000 8000 44000 15000
    R1C11.tif Brightfield No No 34000 8000 44000 15000
    R1C2.tif DAPI Yes No 34000 8000 44000 15000
    R1C3.tif CC3 Yes No 34000 8000 44000 15000
    R1C4.tif NeuN Yes No 34000 8000 44000 15000
    R1C5.tif MBP No Yes R1C6.tif 34000 8000 44000 15000
    R1C6.tif RECA1 Yes Yes R1C5.tif R1C8.tif 22000 24000 32000 31000
    R1C7.tif IBA1 Yes Yes R1C10.tif R1C5.tif R1C8.tif 34000 8000 44000 15000
    R1C8.tif TomatoLectin Yes Yes R1C7.tif 34000 8000 44000 15000
    R1C9.tif PCNA Yes No 34000 8000 44000 15000
    R2C1.tif DAPI Yes No 34000 8000 44000 15000
    R2C10.tif MAP2 No Yes R2C8.tif 34000 8000 44000 15000
    R2C11.tif Brightfield No No 34000 8000 44000 15000
    R2C2.tif DAPI Yes No 34000 8000 44000 15000
    R2C3.tif GAD67 Yes Yes R2C8.tif R2C6.tif 34000 16000 44000 23000
    R2C4.tif GFAP Yes No 34000 8000 44000 15000
    R2C5.tif Parvalbumin Yes Yes R1C5.tif R2C6.tif 31000 12000 41000 19000
    R2C6.tif S100 Yes Yes R2C3.tif R2C8.tif 34000 8000 44000 15000
    R2C7.tif Calretinin Yes Yes R2C6.tif R2C8.tif 18000 1000 28000 17000
    R2C8.tif TomatoLectin Yes No 34000 8000 44000 15000
    R2C9.tif CD31 Yes No 34000 8000 44000 15000

2. Detection

Parse the arguments to main_detection.py:

  • if only DAPI:
    python main_detection.py \
       --INPUT_DIR /path/to/input/dir \
       --OUTPUT_DIR /path/to/output/dir \
       --DAPI R2C1.tif
  • if DAPI + Histones:
    python main_detection.py \
       --INPUT_DIR /path/to/input/dir \
       --OUTPUT_DIR /path/to/input/dir \
       --DAPI R2C1.tif \
       --HISTONES R2C2.tif

3. Classification

Parse the arguments to main_classification.py:

  • if first time classifying:
    python main_classification.py \
    --INPUT_DIR /path/to/input/dir \
    --OUTPUT_DIR /path/to/output/dir \
    --BBXS_FILE /path/to/bbxs_detection.txt \
    --DAPI R2C1.tif \
    --HISTONES R2C2.tif \
    --NEUN R2C4.tif \
    --S100 R3C5.tif \
    --OLIG2 R1C9.tif \
    --IBA1 R1C5.tif \
    --RECA1 R1C6.tif \
    --test_mode first \
    --thresholds 0.5 0.5 0.5 0.5 0.5
  • if you want to adjust the classification results based on new thresholds:
    --INPUT_DIR /path/to/input/dir \
    --OUTPUT_DIR /path/to/output/dir \
    --BBXS_FILE /path/to/bbxs_detection.txt \
    --DAPI R2C1.tif \
    --HISTONES R2C2.tif \
    --NEUN R2C4.tif \
    --S100 R3C5.tif \
    --OLIG2 R1C9.tif \
    --IBA1 R1C5.tif \
    --RECA1 R1C6.tif \
    --test_mode adjust \
    --thresholds .5 .5 .5 .8 .5

4. ICE/FCS File Generation

You can update the classification table by using .fcs or .ice files in FCS Express or Kaluza software to apply real time gating for phenotyping. Using bounding boxes generated from detection module you can generate .fcs and .ice files.

Parse the arguments to GenerateICE_FCS_script.py:

  • From bbox
    python PHENOTYPING/GenerateICE_FCS_script.py \
    --INPUT_DIR /path/to/input/dir \
    --OUTPUT_DIR /path/to/output/dir \
    --maskType=b \
    --maskDir /path/to/bbxs_detection.txt \
    --CHNDEF /path/to/channel/description/50_plex.csv \
    --downscaleRate 4 \  
    --seedSize 2 \
    --erosion_px 5

Arguments:

  • INPUT_DIR : Path to the directory containing input images
  • OUTPUT_DIR : Path to the directory saving output images
  • maskType: Cell detection inputs mask or bbox
  • CHNDEF: dataset definition .csv file
  • downscaleRate: for FCSexpress like visulization software,downscale the image to avoid crashing
  • seedSize: size of nuclear seed objects
  • erosion_px: pixel to shrink the bbox to focus on nuclear

5. Morphological Masking

  • Astrocyte Mask Generation:

To get soma, processes and whole cell masks for astrocytes using S100 and GFAP biomarkers run the following:

matlab -nodesktop -nosplash -r "astrocytes_whole_brain_segmentation('DAPI_PATH','E:\50-plex\final\S1_R1C1.tif', 'HISTONE_PATH','E:\50-plex\final\S1_R2C2.tif','S100_PATH','E:\50-plex\final\S1_R3C5.tif','GFAP_PATH', 'E:\50-plex\final\S1_R3C3.tif','OUTPUT_DIR','astrocytes_OUTPUT','CLASSIFICATION_table_path', 'E:\50-plex\classification_results\classification_table.csv','SEGMENTATION_MASKS','data/merged_labelmask.txt')"

Arguments:

  • DAPI_PATH : Path to DAPI channel

  • HISTONE_PATH : Path to Histone channel

  • GFAP_PATH : Path to GFAP channel

  • S100_PATH : Path to S100 channel

  • OUTPUT_DIR: Path to output masks

  • CLASSIFICATION_table_path: Path to classification table as in 3

  • SEGMENTATION_MASKS: Path to segmentation masks

  • Endothelials Mask Generation:

To get soma, processes and whole cell masks for endothelials using GFP and RECA1 biomarkers run the following:

matlab -nodesktop -nosplash -r "endothelial_whole_brain_segmentation('DAPI_PATH','E:\50-plex\final\S1_R1C1.tif','HISTONE_PATH','E:\50-plex\final\S1_R2C2.tif','GFP_PATH','E:\50-plex\final\S1_R1C4.tif','RECA1_PATH','E:\50-plex\final\S1_R1C6.tif','OUTPUT_DIR','endothelial_OUTPUT','CLASSIFICATION_table_path','E:\50-plex\classification_results\classification_table.csv','SEGMENTATION_MASKS','data/merged_labelmask.txt')"

Arguments:

  • DAPI_PATH : Path to DAPI channel

  • HISTONE_PATH : Path to Histone channel

  • GFP_PATH : Path to GFP channel

  • RECA1_PATH : Path to RECA1 channel

  • OUTPUT_DIR: Path to output masks

  • CLASSIFICATION_table_path: Path to classification table as in 3

  • SEGMENTATION_MASKS: Path to segmentation masks

  • Microglia Mask Generation:

To get soma, processes and whole cell masks for microglia using IBA1 biomarkers run the following:

matlab -nodesktop -nosplash -r "microglia_whole_brain_segmentation('DAPI_PATH','E:\50-plex\final\S1_R1C1.tif','HISTONE_PATH','E:\50-plex\final\S1_R2C2.tif','IBA1_PATH','E:\50-plex\final\S1_R1C5.tif','OUTPUT_DIR','microglia_OUTPUT','CLASSIFICATION_table_path','E:\50-plex\classification_results\classification_table.csv','SEGMENTATION_MASKS','data/merged_labelmask.txt')"

Arguments:

  • DAPI_PATH : Path to DAPI channel

  • HISTONE_PATH : Path to Histone channel

  • IBA1_PATH : Path to IBA1 channel

  • OUTPUT_DIR: Path to output masks

  • CLASSIFICATION_table_path: Path to classification table as in 3

  • SEGMENTATION_MASKS: Path to segmentation masks

  • Neuronal Mask Generation:

To get soma, processes and whole cell masks for neurons using NeuN and MAP2 biomarkers run the following:

matlab -nodesktop -nosplash -r "neuron_whole_brain_segmentation('DAPI_PATH','E:\50-plex\final\S1_R1C1.tif','HISTONE_PATH','E:\50-plex\final\S1_R2C2.tif','NeuN_PATH','E:\50-plex\final\S1_R2C4.tif','MAP2_PATH','E:\50-plex\final\S1_R5C9.tif','OUTPUT_DIR','neuron_OUTPUT','CLASSIFICATION_table_path','E:\50-plex\classification_results\classification_table.csv','SEGMENTATION_MASKS','data/merged_labelmask.txt')"

Arguments:

  • DAPI_PATH : Path to DAPI channel

  • HISTONE_PATH : Path to Histone channel

  • NeuN_PATH : Path to NeuN channel

  • MAP2_PATH : Path to MAP2 channel

  • OUTPUT_DIR: Path to output masks

  • CLASSIFICATION_table_path: Path to classification table as in 3

  • SEGMENTATION_MASKS: Path to segmentation masks

  • Oligodendrocyte Mask Generation:

To get soma, processes and whole cell masks for oligodendrocytes using Olig2 and CNPase biomarkers run the following:

matlab -nodesktop -nosplash -r "oligodendrocytes_whole_brain_segmentation('DAPI_PATH','E:\50-plex\final\S1_R1C1.tif','HISTONE_PATH','E:\50-plex\final\S1_R2C2.tif','OLIG2_PATH','E:\50-plex\final\S1_R1C5.tif','CNPASE_PATH','E:\50-plex\final\S1_R5C4.tif','OUTPUT_DIR','oligodendrocytes_OUTPUT','CLASSIFICATION_table_path','E:\50-plex\classification_results\classification_table.csv','SEGMENTATION_MASKS','data/merged_labelmask.txt')"

Arguments:

  • DAPI_PATH : Path to DAPI channel
  • HISTONE_PATH : Path to Histone channel
  • OLIG2_PATH : Path to Olig2 channel
  • CNPASE_PATH : Path to CNPase channel
  • OUTPUT_DIR: Path to output masks
  • CLASSIFICATION_table_path: Path to classification table as in 3
  • SEGMENTATION_MASKS: Path to segmentation masks

Expected run time on a "normal" desktop computer

The sample 50_plex dataset in a windows machine with the following configuration:

  • 32 core Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
  • NVIDIA GeForce GTX 1080 Ti
  • 256 GB RAM

The run time for each module is:

  • RECONSTRUCTION: 6 hours
  • DETECTION: 4 hours
  • CLASSIFICATION: 2 hours
  • PHENOTYPING: 1 hour
  • MORPHOLOGICAL MASKING: 4 hours

Reference

If you found this implementation useful in your work, please cite the paper:

@article{maric2021whole,
  title={Whole-brain tissue mapping toolkit using large-scale highly multiplexed immunofluorescence imaging and deep neural networks},
  author={Maric, Dragan and Jahanipour, Jahandar and Li, Xiaoyang Rebecca and Singh, Aditi and Mobiny, Aryan and Van Nguyen, Hien and Sedlock, Andrea and Grama, Kedar and Roysam, Badrinath},
  journal={Nature communications},
  volume={12},
  number={1},
  pages={1--12},
  year={2021},
  publisher={Nature Publishing Group}
}