MOMA: The Multi-omics Multi-cohort Assessment (MOMA) Platform

Pei-Chen Tsai, Tsung-Hua Lee, Kun-Chi Kuo, Fang-Yi Su, Tsung-Lu Michael Lee, Eliana Marostica, Tomotaka Ugai, Melissa Zhao, Mai Chan Lau, Juha P. Väyrynen, Marios Giannakis, Yasutoshi Takashima, Seyed Mousavi Kahaki, Kana Wu, Mingyang Song, Jeffrey A. Meyerhardt, Andrew T. Chan, Jung-Hsien Chiang, Jonathan Nowak, Shuji Ogino, Kun-Hsing Yu. Histopathology Images Predicted Multi-Omics Aberrations and Prognoses in Colorectal Cancer Patients. Nature Communications. 2023 Apr 13;14(1):2102.

Requirements

Survival prediction
- Python==3.6.0
- tensorflow==2.4.0
- lifelines
- scipy
- statistics
- matplotlib

Multi-omics characterization
- Python==3.6.0
- torch==1.6.0
- torchvision==0.7.0
- scikit-learn
- numpy
- smooth-topk
- opencv-python
- tqdm

Dataset

Survival prediction: TCGA-COAD and TCGA-READ
Multi-omics characterization: TCGA-COAD and TCGA-READ
Interpretation: NCT-CRC-HE-100K dataset provided by Kather et al.

Data Preprocessing

Tiling: Modified from the DeepSlide GitHub repository. Alternatively, you can download the processed dataset provided by Kather et al.
Tumor detection: ResNet50
Color normalization: Modified from the HEnorm_python GitHub repository

Feature Extraction

You can extract each patch's features with any pre-trained CNN model (as in our multi-omics characterization task) or with a model you train yourself (as in our survival prediction task).

Data Preparation

Survival Prediction

Color normalization

Create a dataframe

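# Survival dataframe: one row per patient.
# The right-hand names are placeholders for your own per-patient lists or arrays.
import pandas as pd

data = {
    'bcr_patient_barcode': patient_ids,  # patient ID
    'vital_status': vital_status,        # overall survival status or disease-free status
    'Days': days,                        # overall survival days or disease-free days
    '0': feature_0,                      # pathology image feature (dimension 1)
    '1': feature_1,                      # pathology image feature (dimension 2)
    # ...
    'n': feature_n,                      # pathology image feature (dimension n + 1)
}

df = pd.DataFrame(data)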
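
As a rough illustration of the layout above, the following sketch assembles the survival dataframe row by row from per-patient records. The helper name build_survival_dataframe, the patient IDs, and all numeric values are made up for this example. The encoding of vital_status as 1 (event) and 0 (censored) is also an assumption, and each patient is assumed to already have a single feature vector.

```python
import pandas as pd


def build_survival_dataframe(records):
    """Build the survival table from (patient_id, status, days, features) tuples.

    Hypothetical helper: each record is assumed to carry one feature vector
    per patient, and all vectors are assumed to have the same length.
    """
    rows = []
    n_features = 0
    for patient_id, status, days, features in records:
        row = {
            "bcr_patient_barcode": patient_id,
            "vital_status": status,
            "Days": days,
        }
        # Feature columns are named "0", "1", ... as strings.
        for i, value in enumerate(features):
            row[str(i)] = float(value)
        n_features = max(n_features, len(features))
        rows.append(row)

    # Fix the column order explicitly so it does not depend on dict ordering.
    columns = ["bcr_patient_barcode", "vital_status", "Days"]
    columns += [str(i) for i in range(n_features)]
    return pd.DataFrame(rows, columns=columns)


# Made-up records for illustration only.
example_records = [
    ("PATIENT-A", 1, 420, [0.12, -0.53, 0.98]),
    ("PATIENT-B", 0, 1310, [0.44, 0.07, -0.21]),
]
df = build_survival_dataframe(example_records)
print(df)
```

Building the table from a list of row dictionaries keeps each patient's metadata and features together, which makes it harder to misalign columns than filling each column separately.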

Multi-omics characterization

XXX_id can be a patient ID or a slide ID, depending on your task. Make sure that each patch_name is identical in the features pickle file and the cluster pickle file.

Sample file

# Patch features pickle
{
  'patch_name' : array([latent feature]),
  'patch_name' : array([latent feature]),
  ...
}

# Cluster pickle file
{
  XXX_id: {
    'patch_name' : cluster label,
    'patch_name' : cluster label,
    ...
  },
  XXX_id: {
    'patch_name' : cluster label,
    'patch_name' : cluster label,
    ...
  },
}

# Label pickle file
{
  XXX_id: class,
  XXX_id: class,
  ...
}
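
Because the three pickle files must agree on IDs and patch names, a quick consistency check before training may save debugging time. The sketch below is one possible way to do this and is not part of the MOMA code. The function names, file names, and toy values are hypothetical, and plain lists stand in for the feature arrays so the example needs only the standard library.

```python
import os
import pickle
import tempfile


def load_pickle(path):
    """Load one pickle file and return its contents."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def check_consistency(features, clusters, labels):
    """Return a list of problems; an empty list means the files agree."""
    problems = []
    for sample_id, patch_to_cluster in clusters.items():
        if sample_id not in labels:
            problems.append(f"{sample_id}: no entry in label file")
        for patch_name in patch_to_cluster:
            if patch_name not in features:
                problems.append(f"{sample_id}/{patch_name}: no entry in feature file")
    for sample_id in labels:
        if sample_id not in clusters:
            problems.append(f"{sample_id}: no entry in cluster file")
    return problems


# Toy data in the three layouts shown above (all values are made up).
features = {"patch_001": [0.1, 0.2], "patch_002": [0.3, 0.4]}
clusters = {"slide_A": {"patch_001": 0, "patch_002": 1, "patch_003": 1}}
labels = {"slide_A": 1}

# Round-trip through temporary pickle files, then validate what was loaded.
with tempfile.TemporaryDirectory() as tmp_dir:
    loaded = []
    for name, obj in [
        ("features.pkl", features),
        ("clusters.pkl", clusters),
        ("labels.pkl", labels),
    ]:
        path = os.path.join(tmp_dir, name)
        with open(path, "wb") as handle:
            pickle.dump(obj, handle)
        loaded.append(load_pickle(path))

problems = check_consistency(*loaded)
print(problems)
```

Here patch_003 appears in the cluster file but has no feature vector, so the check reports it. With real data, an empty list indicates that every clustered patch has features and every ID has both clusters and a label.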

Interpretation