PathLDM: Text conditioned Latent Diffusion Model for Histopathology

Official code for our WACV 2024 publication, PathLDM: Text conditioned Latent Diffusion Model for Histopathology. This codebase builds heavily on CompVis/latent-diffusion.

Updates

💥 Check out our CVPR paper, Learned representation-guided diffusion models for large-image generation, where we train histopathology diffusion models without labeled data.

Requirements

To install the Python dependencies:

conda env create -f environment.yaml
conda activate ldm

Downloading + Organizing Data

tl;dr: The TCGA-BRCA image patches, captions, and tumor/TIL probabilities used in our training can be downloaded from this link. See this file for the Dataset class we use during training.

We obtained machine-readable text reports for TCGA from this repo and used GPT-3.5 to summarize them. Summaries of all BRCA reports can be found at this link.

Obtaining Tumor and TIL probabilities

We used wsinfer to obtain tumor and TIL probabilities. wsinfer works directly on the WSI files and outputs a CSV with the probabilities for each patch, but the size and magnification of its patches might differ from those of the patches extracted by DSMIL. For each 10x patch, we use the average of the probabilities of the overlapping wsinfer patches.
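The averaging step described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the paper: it assumes that both the 10x patch and the wsinfer patches have already been converted to boxes of the form (minx, miny, width, height) in the same coordinate frame, that each wsinfer patch carries a dictionary of probabilities, and that the average is unweighted (not weighted by overlap area). The names `Box`, `boxes_overlap`, and `average_overlapping_probs` are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

# Assumed box layout: (minx, miny, width, height), all in one shared coordinate frame.
Box = Tuple[float, float, float, float]


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes share a region of positive area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def average_overlapping_probs(
    target: Box,
    model_patches: Iterable[Tuple[Box, Dict[str, float]]],
) -> Optional[Dict[str, float]]:
    """Unweighted mean of each probability over the patches that overlap `target`.

    Returns None when no patch overlaps the target box.
    """
    sums: Dict[str, float] = {}
    count = 0
    for box, probs in model_patches:
        if not boxes_overlap(target, box):
            continue
        count += 1
        for name, value in probs.items():
            sums[name] = sums.get(name, 0.0) + value
    if count == 0:
        return None
    return {name: total / count for name, total in sums.items()}


# Hypothetical numbers: two patches overlap the target box, the third does not.
target_patch = (0, 0, 1024, 1024)
wsinfer_patches = [
    ((0, 0, 700, 700), {"tumor": 0.9}),
    ((700, 0, 700, 700), {"tumor": 0.5}),
    ((2000, 2000, 700, 700), {"tumor": 0.1}),
]
averaged = average_overlapping_probs(target_patch, wsinfer_patches)
```

Only the first two patches contribute to `averaged`. If an area-weighted average suits your data better, the intersection area of each pair of boxes could be used as the weight instead.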

Download the WSIs

We used the DSMIL repository to extract 256 x 256 patches @ 10x magnification, resulting in 3.2 million patches for TCGA-BRCA. The following steps are borrowed from the DSMIL repository.

From the GDC data portal: you can use the GDC data portal with a manifest file and a configuration file. The raw WSIs take about 1 TB of disk space and may take several days to download. Please check the details regarding the use of the TCGA data portal. Alternatively, individual WSIs can be downloaded manually from the GDC data portal repository.

Prepare the patches

Once you clone the DSMIL repository, you can use the following command to extract patches from the WSIs.

$ python deepzoom_tiler.py -m 0 -b 10

Pretrained models

We provide the following trained models:

Conditioning network	Conditioning type	Modality	FID	Link
Class embedder	Tumor + TIL	Class label (4 classes)	29.45	link
OpenAI CLIP	Report + tumor + TIL	Text caption (154 tokens)	10.64	link
PLIP	Report + tumor + TIL	Text caption (154 tokens)	7.64	link

Training

To train a diffusion model, create a config file similar to this and create or update the corresponding dataloader (e.g., this). To download the frozen VAEs, follow the instructions in the original LDM repo.
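As a rough guide to what such a dataloader needs to provide, here is a minimal map-style dataset sketch. It is not the Dataset class from this repository (see the linked file for that). The class name `PatchCaptionDataset`, the `records` layout, and the injected `load_image` callable are all hypothetical, and the dictionary keys are assumed to match the `first_stage_key` and `cond_stage_key` values set in your config.

```python
from typing import Any, Callable, Dict, Sequence, Tuple


class PatchCaptionDataset:
    """Hypothetical map-style dataset pairing each image patch with a text caption.

    It only implements __len__ and __getitem__, which is the interface a
    map-style dataset needs, so this sketch has no torch dependency.
    """

    def __init__(
        self,
        records: Sequence[Tuple[str, str]],   # assumed layout: (patch_path, caption)
        load_image: Callable[[str], Any],     # assumed: returns the preprocessed image
        image_key: str = "image",             # should match first_stage_key in the config
        caption_key: str = "caption",         # should match cond_stage_key in the config
    ) -> None:
        self.records = list(records)
        self.load_image = load_image
        self.image_key = image_key
        self.caption_key = caption_key

    def __len__(self) -> int:
        return len(self.records)

    def __getitem__(self, idx: int) -> Dict[str, Any]:
        patch_path, caption = self.records[idx]
        return {
            self.image_key: self.load_image(patch_path),
            self.caption_key: caption,
        }


# Toy usage with a stand-in loader that just echoes the path.
toy_dataset = PatchCaptionDataset(
    records=[("patch_0.png", "example caption")],
    load_image=lambda path: f"<pixels of {path}>",
)
first_item = toy_dataset[0]
```

In practice `load_image` would read the patch from disk and apply whatever resizing and normalization your autoencoder expects.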

Example training command :

python main.py -t --gpus 0,1 --base configs/latent-diffusion/text_cond/plip_imagenet_finetune.yaml

Sampling

This notebook shows how to sample from the text conditioned diffusion model.

BibTeX

@InProceedings{Yellapragada_2024_WACV,
    author    = {Yellapragada, Srikar and Graikos, Alexandros and Prasanna, Prateek and Kurc, Tahsin and Saltz, Joel and Samaras, Dimitris},
    title     = {PathLDM: Text Conditioned Latent Diffusion Model for Histopathology},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {5182-5191}
}
