Whole transcriptome digital spatial profiling of pancreatic cancer

Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal and treatment-refractory cancer. Molecular stratification in pancreatic cancer remains rudimentary and does not yet inform clinical management or therapeutic development. We constructed a high-resolution molecular landscape of the multicellular subtypes and spatial communities that compose PDAC using single-nucleus RNA-seq and whole-transcriptome digital spatial profiling (DSP) of 43 primary PDAC tumor specimens that either received neoadjuvant therapy or were treatment-naïve. We uncovered recurrent expression programs across malignant cells and fibroblasts, including a newly identified neural-like progenitor malignant cell program that was enriched after chemotherapy and radiotherapy and associated with poor prognosis in independent cohorts. Integrating spatial and cellular profiles revealed three multicellular communities with distinct contributions from malignant, fibroblast, and immune subtypes: classical, squamoid-basaloid, and treatment-enriched. Our refined molecular and cellular taxonomy can provide a framework for stratification in clinical trials and serve as a roadmap for therapeutic targeting of specific cellular phenotypes and multicellular interactions.

In this repository, we present the analysis conducted for the whole transcriptome DSP experiments.

Our manuscript (in press at Nature Genetics) will be available soon. The preprint can be found here. Raw and processed data can be found at GEO under accession number GSE199102. Code for the single-nucleus RNA-seq analysis can be found here.

Data analysis

Data preprocessing

FASTQ files for DSP were aggregated into count matrices using the azorius and hydra pipeline. Normalized expression was then detrended to model cell-type-specific expression.

Normalized data can be found here. Detrended data can be found at GEO under accession number GSE199102.

Cell type deconvolution and program scoring

Programs were scored for each DSP sample within each ROI using ssGSEA, and the resulting scores were z-score transformed.
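The z-score step can be illustrated with a minimal sketch. It is not the repository's code: the function name `zscore`, the example scores, and the choice to standardize one program across samples using the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore(values):
    """Center values to mean 0 and scale them to unit standard deviation."""
    mu = mean(values)
    # Assumption: population SD; the sample SD (statistics.stdev) is an
    # equally plausible choice and only changes the scale slightly.
    sd = pstdev(values)
    if sd == 0:
        # A constant program carries no relative information.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical ssGSEA scores for one program across four DSP samples.
raw_scores = [0.10, 0.30, 0.50, 0.70]
z = zscore(raw_scores)
```

After this transform, scores for different programs are on a common scale, which makes them comparable in downstream dispersion and clustering steps.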

For each program, intra-patient dispersion of program expression was calculated by taking the interquartile range (IQR) of program scores across all ROIs within each individual tumor and then averaging these IQRs across patients. In contrast, inter-patient dispersion of program expression was computed as the IQR, across tumors, of each tumor's mean program score. Code for this analysis can be found here.
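The two dispersion measures can be sketched as follows for a single program. This is an illustrative reading of the description above, not the repository's implementation: the patient labels, the scores, the function names, and the "inclusive" quantile method are assumptions (different quantile conventions give slightly different IQRs).

```python
from statistics import mean, quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of a list of at least two numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def intra_patient_dispersion(scores_by_patient):
    """Mean over patients of the per-tumor IQR across ROIs."""
    return mean([iqr(rois) for rois in scores_by_patient.values()])


def inter_patient_dispersion(scores_by_patient):
    """IQR across tumors of each tumor's mean program score."""
    return iqr([mean(rois) for rois in scores_by_patient.values()])


# Hypothetical z-scored program scores: patient -> one score per ROI.
scores_by_patient = {
    "P1": [0.0, 1.0, 2.0, 3.0, 4.0],
    "P2": [1.0, 1.0, 2.0, 3.0, 3.0],
    "P3": [5.0, 6.0, 7.0, 8.0, 9.0],
}

intra = intra_patient_dispersion(scores_by_patient)
inter = inter_patient_dispersion(scores_by_patient)
```

A program with high intra-patient and low inter-patient dispersion varies mostly between regions of the same tumor, while the reverse pattern indicates a program that mainly distinguishes one tumor from another.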

Unsupervised hierarchical clustering was performed on all features (malignant programs, CAF programs, deconvolved immune cell type proportions, and compartment areas within each ROI) using the Pearson correlation distance and average linkage. Code for the immune cell type deconvolution analysis can be found here.
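To make the distance and linkage choices concrete, here is a small pure-Python sketch of agglomerative clustering with Pearson correlation distance (1 - r) and average linkage. It is a toy version for illustration only: the feature names and values are hypothetical, clustering feature profiles across ROIs is an assumption, and a real analysis would normally use an established clustering library rather than this quadratic loop.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    names = list(profiles)
    # Pairwise correlation distance between individual features.
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def cluster_distance(c1, c2):
        # Average linkage: mean of all member-to-member distances.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        d, i, j = min(
            (cluster_distance(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical feature profiles measured across four ROIs.
profiles = {
    "classical": [1.0, 2.0, 3.0, 4.0],
    "caf_a": [2.0, 4.0, 6.0, 8.5],
    "basaloid": [4.0, 3.0, 2.0, 1.0],
    "cd8_t": [4.2, 3.0, 1.8, 1.0],
}
merges = average_linkage(profiles)
```

Because the distance is 1 - r, features that rise and fall together across ROIs merge early even when their absolute scales differ, while anti-correlated features (distance near 2) merge last.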

ssGSEA program scores can be found here.

Receptor ligand analysis

Known receptor-ligand pairs were obtained from CellPhoneDB v2.0. Potential receptor-ligand interactions were quantified using the Spearman rank correlation between paired segments within the same ROI, across all ROIs containing those pairs. Interactions were calculated for both non-self (juxtacrine) pairs and self (autocrine) pairs, the latter occurring within the same segment. Receptor-ligand interactions were calculated separately for untreated and CRT-treated specimens to determine interactions that are differential between conditions. All tests were two-sided with a significance level of p ≤ 0.05 and were adjusted for multiple testing where appropriate using the false discovery rate.
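The correlation and multiple-testing steps can be sketched as below. This is a simplified illustration rather than the repository's code: the expression values and p-values are hypothetical, the helper names are invented here, and Benjamini-Hochberg is assumed as the false discovery rate procedure since the text does not name one. Computing the p-value of a Spearman correlation is omitted because it requires a t-distribution that the Python standard library does not provide.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, idx in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical expression across five ROIs: a ligand measured in one
# segment and its receptor measured in the paired segment of the same ROI.
ligand_in_segment_a = [1.0, 2.0, 3.0, 4.0, 5.0]
receptor_in_segment_b = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman(ligand_in_segment_a, receptor_in_segment_b)

# Hypothetical raw p-values for four receptor-ligand pairs.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q <= 0.05 for q in adjusted]
```

Running the same procedure separately on untreated and CRT specimens yields two sets of correlations per receptor-ligand pair, which can then be compared to find interactions that differ between conditions.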

Code for this analysis can be found here.