OCDPI

Code for 'Predicting prognoses and therapy responses in ovarian cancer patients from histopathology images using graph deep learning: a multicenter retrospective study'

Dataset

TCGA: We incorporate the TCGA-OV cohort into our study; it is open access.
PLCO: The ovarian cancer data from PLCO were used in this study. To obtain PLCO data, submit an application through the official PLCO website.
HMUCH: The HMUCH cohort is available from the corresponding author upon reasonable request.

datasets

clinical_data: Clinical information for each cohort, stored in CSV format. At least three columns (id, event time, and event state) are required for training or for obtaining evaluation results.
WSIs: Stores the whole slide images of each cohort.
patches: Stores the patches extracted from WSIs.
graphs: Stores the graph representations of WSIs.
gradients: Stores the gradients of patches in the TCGA discovery cohort.
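
The column requirement for clinical_data can be checked before training. The following is a minimal sketch, not part of the repository: the column names `id`, `event_time`, and `event_state` and the helper `load_clinical_rows` are assumptions chosen for illustration, since the README only states that an id, an event time, and an event state are needed.

```python
import csv
import io

# Hypothetical column names: the README only says that an id, an event time,
# and an event state column are required; the exact headers are an assumption.
REQUIRED_COLUMNS = ("id", "event_time", "event_state")


def load_clinical_rows(handle):
    """Read clinical rows from a CSV file object and validate the header."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"clinical data is missing columns: {missing}")
    rows = []
    for row in reader:
        rows.append({
            "id": row["id"],
            "event_time": float(row["event_time"]),   # follow-up time
            "event_state": int(row["event_state"]),   # 1 = event, 0 = censored
        })
    return rows


# In-memory example standing in for a file under datasets/clinical_data.
example = io.StringIO("id,event_time,event_state\nP001,24.5,1\nP002,60.0,0\n")
rows = load_clinical_rows(example)
```

Extra columns are ignored, so a richer clinical table passes the check as long as the three required columns are present.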

checkpoints

checkpoint_CTransPath: CTransPath model weights pretrained by the CTransPath authors.
checkpoint_GDL: Graph-based deep learning (GDL) model pretrained on our TCGA discovery cohort.

data_preprocessing

multi_thread_WSI_segmentation.py: Segments WSIs into patches and filters them. Implemented with the histolab package.

get_patches_feature

ctran.py: Implementation of CTransPath.
get_CTransPath_features.py: Uses the pretrained CTransPath model to extract histopathological features from patches.

Part of the implementation here is based on CTransPath.

construction_OCDPI

utils/conceptualize_WSI_to_graph.py: Builds the graph representation of WSIs, which is used as input to the graph-based deep learning (GDL) model.
utils/dataset.py: Generates the datasets.
utils/util.py: Utilities and the loss function used in training.
utils/calculate_gradient_of_patch.py: Integrated Gradients (IG)-based gradient calculation for model interpretability.
utils/visualisation.py: Gradient value visualization.
model: Implementation of GDL model.
train: Training the GDL model.
evaluation: Evaluation of the GDL model in multi-center external cohorts.
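
As an illustration of the Integrated Gradients idea behind utils/calculate_gradient_of_patch.py, the sketch below attributes a scalar output to its input features by averaging gradients along a straight path from a baseline to the input. It is a pure-Python approximation using finite differences and is not the repository's implementation; `toy_risk` is a hypothetical stand-in for a model output, and the all-zero baseline is an assumption.

```python
def integrated_gradients(f, x, baseline, steps=50, eps=1e-5):
    """Approximate Integrated Gradients for a scalar function f.

    Uses a midpoint Riemann sum along the straight path from baseline to x,
    with central finite differences standing in for autograd.
    """
    n = len(x)
    mean_grads = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th path segment
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            up = list(point)
            down = list(point)
            up[i] += eps
            down[i] -= eps
            grad_i = (f(up) - f(down)) / (2 * eps)
            mean_grads[i] += grad_i / steps
    # Scale the path-averaged gradient by the input-minus-baseline difference.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, mean_grads)]


def toy_risk(features):
    """Hypothetical stand-in for a model's risk output (not the GDL model)."""
    return 2.0 * features[0] + features[1] ** 2


x = [1.0, 3.0]
baseline = [0.0, 0.0]  # assumed all-zero baseline
attributions = integrated_gradients(toy_risk, x, baseline)
```

A useful sanity check is the completeness property: the attributions should sum to approximately `toy_risk(x) - toy_risk(baseline)`.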

Usage

To reproduce the paper or apply the code to your own WSI dataset, follow this workflow:

Configure the environment.
Create a folder for your data in datasets and download or move the WSIs there.
Use data_preprocessing/multi_thread_WSI_segmentation.py to segment WSIs into patches.
Use get_patches_feature/get_CTransPath_features.py to obtain the representation vectors of patches.
Use construction_OCDPI/utils/conceptualize_WSI_to_graph.py to obtain the graph representation of each WSI.
Run construction_OCDPI/train.py to train the GDL model. Once training is complete, you can use the GDL model to calculate the OCDPI (Ovarian Cancer Digital Pathology Index) for each WSI.
Use construction_OCDPI/evaluation.py to obtain evaluation results for the survival prediction performance of the GDL model.
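
Survival prediction performance is commonly summarized with the concordance index (C-index). The sketch below shows how such a score can be computed from event times, event states, and predicted risk scores. It is an assumption that a metric of this kind is relevant here; the README does not state which metrics evaluation.py reports, and `concordance_index` is a hypothetical helper, not a function from the repository.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i has an observed event and a
    strictly shorter time than subject j. The pair is concordant when the
    subject with the shorter time has the higher predicted risk; ties in
    risk count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the third one censored.
times = [5.0, 10.0, 15.0, 20.0]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]  # hypothetical risk scores, e.g. an index per WSI
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.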

ZhoulabCPH/OCDPI