pT1-HBTG-MIDL2023

Code for the publication "Tumor Budding T-cell Graphs: Assessing the Need for Resection in pT1 Colorectal Cancer Patients", presented at MIDL 2023.

BibTeX:

@inproceedings{studer2023tumor,
  title={Tumor Budding T-cell Graphs: Assessing the Need for Resection in pT1 Colorectal Cancer Patients},
  author={Studer, Linda and Bokhorst, John-Melle and Nagtegaal, Iris and Zlobec, Inti and Dawson, Heather and Fischer, Andreas},
  booktitle={Medical Imaging with Deep Learning},
  year={2023}
}

The pT1-HBTG Dataset

The dataset is available on Zenodo: https://zenodo.org/record/7867085

How to run: Graph Neural Network Framework using PyTorch Geometric and PyTorch Lightning

This framework lets you efficiently set up experiments for graph-level classification. It is based on PyTorch and uses the PyTorch Geometric library to build the graph datasets and GNN architectures, and PyTorch Lightning to organize the code. Weights & Biases is used for experiment logging.

Installation

First, set up a conda environment with the required Python packages (for the full list of package versions, see gnn-env.yml):

# create conda env
conda create --name gnn python=3.8
conda activate gnn

# install pytorch 1.11.0 with the cuda version matching your system (check the pytorch website), e.g.
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
# or, for cpu only (e.g. on a Mac)
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 -c pytorch

# install pytorch geometric for cuda 11.3, matching the pytorch install above (for other versions, check the website)
conda install pyg -c pyg

# install pytorch lightning
pip install pytorch-lightning==1.6.3

# install other packages
conda install -c conda-forge wandb==0.12.17
conda install seaborn==0.11.2

# torchmetrics should already be installed, if not run
conda install -c conda-forge torchmetrics
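
After installation, it can help to confirm that the pinned versions actually ended up in the environment. The following is a minimal sketch, not part of the repository: it assumes the packages expose standard Python package metadata under their PyPI distribution names, and the helper name check_versions is made up for illustration. The version pins are the ones from the commands above.

```python
from importlib import metadata

# Pins taken from the install commands above.
# Keys are PyPI distribution names (an assumption for conda-installed packages).
EXPECTED = {
    "torch": "1.11.0",
    "pytorch-lightning": "1.6.3",
    "wandb": "0.12.17",
    "seaborn": "0.11.2",
}


def check_versions(expected):
    """Return {name: (installed_or_None, expected, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Some builds append a local tag such as "+cu113"; ignore it when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, wanted, matches)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:20s} installed={installed} expected={wanted} [{status}]")
```

A package reported as missing even though it imports fine was most likely installed without metadata; in that case, check its `__version__` attribute by hand.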

Dataset parsing

GXL format data: GXL (Graph eXchange Language) is an XML-based format developed for graphs. The class and dataset splits can be set up in three different ways:
- Using a train.cxl, valid.cxl and test.cxl file (as for the IAMDB Graph datasets)
- Using a folder structure with {train/test/val}/{class1, class2, ...}/*.gxl.
- Using a JSON file that provides a cross-validation fold split together with the class labels.
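
To give a feel for the GXL format, here is a minimal sketch that reads nodes, edges and their attributes from a GXL string using only the Python standard library. It is not the loader used in this framework. The example graph, its attribute names (x, y, type, distance) and values, and the function name parse_gxl are assumptions for illustration, following the layout used by the IAMDB graph datasets.

```python
import xml.etree.ElementTree as ET

# Hand-written example graph; attribute names and values are illustrative only.
EXAMPLE_GXL = """<gxl>
  <graph id="example" edgeids="false" edgemode="undirected">
    <node id="_0">
      <attr name="x"><float>10.5</float></attr>
      <attr name="y"><float>20.0</float></attr>
      <attr name="type"><string>bud</string></attr>
    </node>
    <node id="_1">
      <attr name="x"><float>12.0</float></attr>
      <attr name="y"><float>18.5</float></attr>
      <attr name="type"><string>lymphocyte</string></attr>
    </node>
    <edge from="_0" to="_1">
      <attr name="distance"><float>2.12</float></attr>
    </edge>
  </graph>
</gxl>
"""

# GXL wraps each attribute value in a typed element such as <float> or <string>.
_CASTS = {"float": float, "double": float, "int": int, "string": str}


def _read_attrs(element):
    """Collect the <attr> children of a node or edge into a plain dict."""
    attrs = {}
    for attr in element.findall("attr"):
        if len(attr) == 0:  # attribute without a typed value element
            continue
        value = attr[0]
        cast = _CASTS.get(value.tag, str)  # unknown types are kept as strings
        attrs[attr.get("name")] = cast(value.text)
    return attrs


def parse_gxl(text):
    """Return (graph_id, nodes, edges) for the first <graph> in a GXL string."""
    graph = ET.fromstring(text).find("graph")
    nodes = {n.get("id"): _read_attrs(n) for n in graph.findall("node")}
    edges = [(e.get("from"), e.get("to"), _read_attrs(e)) for e in graph.findall("edge")]
    return graph.get("id"), nodes, edges


graph_id, nodes, edges = parse_gxl(EXAMPLE_GXL)
print(graph_id, len(nodes), "nodes,", len(edges), "edges")
```

The resulting node and edge dictionaries are the kind of raw input a PyTorch Geometric dataset class would then convert into feature and edge-index tensors.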

Set up your experiment

To set up your experiments, create your own experiment file in the project folder and define a class that inherits from Experiment in /project/experiment_template.py. In that class you specify your experimental set-up, e.g. the runner you want to use, the transformations, the performance metrics, etc.
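
The inheritance pattern described above can be sketched as follows. This is only an illustration of the idea: the Experiment class below is a simplified stand-in, not the one from /project/experiment_template.py, and every attribute and method name in it (runner_name, transform_names, metric_names, summary) is hypothetical. Check the template file for the actual interface.

```python
class Experiment:
    """Simplified stand-in for the template base class (hypothetical interface)."""

    runner_name = None    # which runner to use, e.g. a classification runner
    transform_names = ()  # dataset transformations to apply
    metric_names = ()     # performance metrics to log

    def summary(self):
        """Return the experiment set-up as a dict, failing if no runner is set."""
        if self.runner_name is None:
            raise ValueError("an experiment must specify a runner")
        return {
            "runner": self.runner_name,
            "transforms": list(self.transform_names),
            "metrics": list(self.metric_names),
        }


class MyGraphClassification(Experiment):
    """Example experiment: only the set-up differs from the base class."""

    runner_name = "classification"
    transform_names = ("normalize_features",)
    metric_names = ("accuracy", "f1")


print(MyGraphClassification().summary())
```

Keeping the set-up in a subclass means each experiment file stays small and the shared training logic lives in one place.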

Run your experiment

Examples:

./project/bts_experiments.py --input-folder dino-200/other_edge_fct_delaunay --output-folder results-midl/dinotypecoord --model gin_jk --experiment-name gin_jk-other_edge_fct_delaunay-dinotypecoord_addf --wandb-project BTS-MIDL-cv5-dinotypecoord --gpu-id 0 --config-json config/bts_midl.json --add-mlp-feature-csv add_mlp_features.csv

./project/bts_type_only.py --input-folder xylabel/other_edge_fct_hierarchical-cutoff-100 --output-folder results-midl/type --model graphsage --experiment-name graphsage-other_edge_fct_hierarchical-cutoff-100-type --wandb-project BTS-MIDL-cv5-type --gpu-id 2 --config-json config/bts_midl.json

For the full list of command line arguments (CLA), see util/arg_parser.py. Arguments can be specified via the command line, a config JSON file, or both; parameters set in the config file override those given on the command line.
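
The precedence rule (config file wins over command line) can be sketched as follows. This is not the code from util/arg_parser.py. The flags --model, --gpu-id and --config-json appear in the examples above, but the function names, the defaults, and the assumption that config keys may be written with dashes or underscores are made up for illustration.

```python
import argparse
import json


def build_parser():
    """A tiny parser with three of the flags used in the examples above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default=None)
    parser.add_argument("--gpu-id", type=int, default=None)
    parser.add_argument("--config-json", default=None)
    return parser


def merge_args(argv, config):
    """Parse argv, then let every key in the config dict override the parsed value."""
    merged = dict(vars(build_parser().parse_args(argv)))
    for key, value in config.items():
        # argparse stores "--gpu-id" as "gpu_id"; accept both spellings in the config.
        merged[key.replace("-", "_")] = value
    return merged


# The config would normally be read from the file passed via --config-json.
config = json.loads('{"model": "graphsage"}')
args = merge_args(["--model", "gin_jk", "--gpu-id", "0"], config)
print(args["model"], args["gpu_id"])
```

Here the model given on the command line is replaced by the one from the config, while gpu_id, which the config does not mention, keeps its command-line value.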

Framework structure

data_modules: Contains the PyTorch Geometric (custom) dataset loaders
model_modules: Contains the graph neural network architectures
project: Contains the experimental set-up for specific experiments
runners: Contains the pytorch_lightning.LightningModule classes for a specific task (e.g. classification)
util: Contains utility functions, such as project-specific argument parsers
util_scripts: Contains additional utility scripts that are separate from the experimental framework
