OCDPI

Code for 'Predicting prognoses and therapy responses in ovarian cancer patients from histopathology images using graph deep learning: a multicenter retrospective study'

[Figure: flowchart]


Dataset

  • TCGA: The TCGA-OV cohort is included in this study and is openly accessible to all.
  • PLCO: The ovarian cancer data from PLCO were used in this study. To obtain PLCO data, submit an application on the official website.
  • HMUCH: The HMUCH cohort is available from the corresponding author upon reasonable request.

datasets

  • clinical_data: Clinical information for each cohort, stored in CSV format. At least three columns (id, event time, and event state) are required for training or for obtaining evaluation results.
  • WSIs: Store whole slide images of each cohort.
  • patches: Store patches extracted from WSIs.
  • graphs: Store graph representation of WSIs.
  • gradients: Store gradients of patches in TCGA discovery cohort.
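The clinical_data requirement above can be checked before training. The following is a minimal sketch, not part of the repository: the column names `id`, `event_time`, and `event_state` and the 0/1 coding of the event state (1 = event observed) are assumptions, since the README only states that these three kinds of columns must exist.

```python
import csv
import io

# Hypothetical column names; the README only says that an id, an event time,
# and an event state column are required.
REQUIRED_COLUMNS = ("id", "event_time", "event_state")


def load_clinical_rows(handle, required=REQUIRED_COLUMNS):
    """Read a clinical CSV and return a list of (id, time, state) tuples."""
    reader = csv.DictReader(handle)
    missing = [c for c in required if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    id_col, time_col, state_col = required
    rows = []
    for record in reader:
        time = float(record[time_col])
        state = int(record[state_col])
        # Assumed coding: 1 = event observed, 0 = censored.
        if time < 0 or state not in (0, 1):
            raise ValueError(f"invalid survival record for id {record[id_col]!r}")
        rows.append((record[id_col], time, state))
    return rows


# In-memory example so the sketch runs without any file on disk.
example = io.StringIO(
    "id,event_time,event_state\n"
    "P001,34.5,1\n"
    "P002,60.0,0\n"
)
rows = load_clinical_rows(example)
print(rows)
```

To check a real file, pass an open file handle instead of the `io.StringIO` object and adjust `REQUIRED_COLUMNS` to the actual header names.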

checkpoints

  • checkpoint_CTransPath: Pretrained CTransPath model, as released by the CTransPath project.
  • checkpoint_GDL: Graph-based deep learning (GDL) pretrained on our TCGA discovery cohort.

data_preprocessing

  • multi_thread_WSI_segmentation.py: Segments WSIs into patches and filters them. Built on the histolab package.
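To illustrate what segmenting and filtering patches can mean, here is a simplified sketch in plain Python. It does not use histolab and is not the repository's implementation: the non-overlapping grid, the binary tissue mask, and the `min_tissue` threshold of 0.8 are all assumptions chosen for the example.

```python
def grid_tiles(width, height, tile_size):
    """Yield (x, y) top-left corners of non-overlapping tiles that fit fully."""
    for y in range(0, height - tile_size + 1, tile_size):
        for x in range(0, width - tile_size + 1, tile_size):
            yield x, y


def tissue_fraction(mask, x, y, tile_size):
    """Fraction of pixels in the tile that the binary mask marks as tissue."""
    total = 0
    for row in mask[y:y + tile_size]:
        total += sum(row[x:x + tile_size])
    return total / (tile_size * tile_size)


def select_tiles(mask, tile_size, min_tissue=0.8):
    """Keep only the tiles whose tissue fraction reaches the threshold."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    return [
        (x, y)
        for x, y in grid_tiles(width, height, tile_size)
        if tissue_fraction(mask, x, y, tile_size) >= min_tissue
    ]


# Toy 4x4 mask: the left half is tissue (1), the right half is background (0).
mask = [[1, 1, 0, 0] for _ in range(4)]
kept = select_tiles(mask, tile_size=2, min_tissue=0.8)
print(kept)
```

On a real slide the mask would come from a tissue detector and the tiles would be read from the WSI at a chosen magnification; the selection logic stays the same.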

get_patches_feature

  • ctran.py: Implementation of CTransPath.

  • get_CTransPath_features.py: Uses the pretrained CTransPath model to extract histopathological features from patches.

    Part of the implementation here is based on CTransPath.

construction_OCDPI

  • utils/conceptualize_WSI_to_graph.py: Builds the graph representation of each WSI, which is then used as input to the graph-based deep learning (GDL) model.
  • utils/dataset.py: Generate datasets.
  • utils/util.py: Tools and loss function used in training.
  • utils/calculate_gradient_of_patch.py: Integrated Gradients (IG)-based gradient calculation for model interpretability.
  • utils/visualisation.py: Gradient value visualization.
  • model: Implementation of GDL model.
  • train: Training the GDL model.
  • evaluation: Evaluation of the GDL model in multi-center external cohorts.
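As an illustration of turning a WSI into a graph, the sketch below links patches that are neighbours on the patch grid. This is an assumption made for the example: the README does not say how edges are defined in conceptualize_WSI_to_graph.py, and the function name `build_patch_graph` and the 8-connected neighbourhood are hypothetical.

```python
def build_patch_graph(coords):
    """Return an undirected edge list linking 8-connected neighbouring patches.

    coords: list of (col, row) grid positions, one per patch. The index of a
    position in this list is used as the node id.
    """
    index = {pos: i for i, pos in enumerate(coords)}
    edges = set()
    for (col, row), i in index.items():
        for dc in (-1, 0, 1):
            for dr in (-1, 0, 1):
                if dc == 0 and dr == 0:
                    continue  # skip the patch itself
                j = index.get((col + dc, row + dr))
                if j is not None:
                    # Store each undirected edge once, smaller node id first.
                    edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Three adjacent patches and one isolated patch.
coords = [(0, 0), (1, 0), (1, 1), (3, 3)]
edges = build_patch_graph(coords)
print(edges)
```

In a full pipeline each node would also carry the feature vector of its patch, and the edge list would be converted to whatever format the graph library in use expects.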

Usage

To reproduce the paper or to apply the code to your own WSI dataset, follow this workflow:

  1. Configure the environment.
  2. Create a folder for your data in datasets and download or move the WSIs there.
  3. Use data_preprocessing/multi_thread_WSI_segmentation.py to segment WSIs into patches.
  4. Use get_patches_feature/get_CTransPath_features.py to obtain the representation vectors of the patches.
  5. Use construction_OCDPI/utils/conceptualize_WSI_to_graph.py to obtain the graph representations of the WSIs.
  6. Run construction_OCDPI/train.py to train the GDL model. Once training is complete, you can use this GDL model to calculate the OCDPI (Ovarian Cancer Digital Pathology Index) for each WSI.
  7. Run construction_OCDPI/evaluation.py to obtain the evaluation results for the survival prediction performance of the GDL model.
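The README does not name the metric reported in step 7. A common measure of survival prediction performance is the concordance index (Harrell's C), so the sketch below assumes that metric purely for illustration. It treats a higher score as higher risk and, for simplicity, ignores pairs with tied times. The function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times:  observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risks:  predicted risk scores (higher = expected earlier event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a comparable pair must start from an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: higher risk scores go with shorter times, so ranking is perfect.
times = [5.0, 10.0, 12.0, 20.0]
events = [1, 0, 1, 0]
risks = [0.9, 0.4, 0.6, 0.1]
c = concordance_index(times, events, risks)
print(c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. For real analyses an established survival analysis library is preferable to this quadratic-time loop.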