A Developmental Deconvolution Multilayer Perceptron for Classification of Cancer Origin

Project Overview

Cancer is a disease manifesting in abrogation of developmental programs, and malignancies are named based on their cell or tissue of origin. However, a systematic atlas of tumor origins is lacking.

Here we map the single cell organogenesis of 56 developmental trajectories to the transcriptomes of over 10,000 tumors across 33 cancer types. We use this map to deconvolute individual tumors into their constituent developmental trajectories. Based on these deconvoluted developmental programs, we construct a Developmental Multilayer Perceptron (D-MLP) classifier that outputs cancer origin.

The D-MLP classifier (ROC-AUC: 0.974 for top prediction) outperforms classification based on expression of either oncogenes or highly variable genes. We analyze tumors from patients with cancer of unknown primary (CUP), selecting the most difficult cases where extensive multimodal workup yielded no definitive tumor type. D-MLP revealed insights into developmental origins and diagnosis for most patient tumors.
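For readers unfamiliar with the metric quoted above, ROC-AUC for one class can be read as the probability that a randomly chosen tumor of that class receives a higher score than a randomly chosen tumor of any other class. The sketch below computes it that way on made-up scores. The function name `roc_auc` and the toy data are illustrative assumptions, not code or results from this repository, and how the paper aggregates the metric across the 33 cancer types is not described here.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy one-vs-rest example (invented values): 1 = tumor belongs to the class, 0 = it does not.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

In this toy case 11 of the 12 positive/negative pairs are ranked correctly, so the value is 11/12. A value of 1.0 would mean every tumor of the class is scored above every other tumor, and 0.5 corresponds to random ranking.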

Our results provide a map of tumor developmental origins and a tool for diagnostic pathology, and suggest that developmental classification may be a useful approach for otherwise unclassified patient tumors.

Code Overview

The code folder contains the scripts used to generate the analysis and figures shown in Moiso et al. (add url here). The scripts are written in R, shell and Python, and require the following packages:

R

R version 3.5.1 or older is required with the following libraries:

And the following Bioconductor package:

limma

Shell

Shell scripts are used for FASTQ read analysis and require the following software:

STAR v2.7.1a
RSEM v1.3.1

Python

Python version 3.6.4 is used to generate and evaluate the MLP models, and the following modules and libraries are required:

keras 2.2.0
numpy 1.19.5
scikit-learn 0.19.1
sys
tensorflow 1.5.0
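If you run the Python scripts outside Docker, it may help to confirm that the installed versions match the pins listed above before training or evaluating a model. The following is a minimal sketch of such a check. The helper names `installed_version` and `find_mismatches` are hypothetical and not part of this repository, and `sys` is omitted because it ships with Python.

```python
# Pinned versions copied from the list above ("sys" is part of the standard library).
PINNED = {
    "keras": "2.2.0",
    "numpy": "1.19.5",
    "scikit-learn": "0.19.1",
    "tensorflow": "1.5.0",
}


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python >= 3.8
    except ImportError:
        import pkg_resources  # setuptools fallback for older interpreters such as 3.6
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in sorted(find_mismatches(PINNED).items()):
        print("{}: expected {}, found {}".format(name, expected, found or "not installed"))
```

An empty result means the environment matches the pins. The `lookup` argument exists only so the comparison can be exercised without the packages being installed.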

Docker

To make the analysis and figures of our work easy to reproduce, we dockerized the environment used in the paper.
This requires Docker to be installed on your system. If it is not, don't panic: installation is simple, just follow the Docker installation instructions for Linux, Mac or Windows.

When Docker is up and running you can clone this git repository with:

git clone https://github.com/emoiso/DevTum.git

After cloning the repository, you can build the image with the following commands:

cd DevTum
sudo docker build -t devtum .

After the devtum image has been successfully built, you can test it by running:

sudo docker run --entrypoint code/figs7b.R -v $PWD:/home devtum

The previous command generates the UMAP shown in Figure S7 and saves it in figs/paper on your system.
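The test command above can also be assembled from Python, which may be convenient when scripting several runs. The sketch below only builds and prints the command; it does not start Docker. The helper `docker_run_command` is hypothetical, and only `figs7b.R` is confirmed by this README, so treating other scripts under code/ as valid entrypoints is an assumption.

```python
import os
import shlex


def docker_run_command(script, repo_dir, image="devtum", use_sudo=True):
    """Build the `docker run` argument list for one script under code/.

    Mirrors the command above: the script is the entrypoint and the
    repository checkout is mounted at /home inside the container.
    """
    command = ["sudo"] if use_sudo else []
    command += [
        "docker", "run",
        "--entrypoint", "code/" + script,
        "-v", os.path.abspath(repo_dir) + ":/home",
        image,
    ]
    return command


# Run from the root of the cloned DevTum repository.
cmd = docker_run_command("figs7b.R", ".")
print(" ".join(shlex.quote(part) for part in cmd))
# To execute it instead of printing: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if the checkout path contains spaces.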

If you encounter any problems or bugs, or have any questions, please contact Enrico Moiso (em.metaminer@gmail.com).

Created and maintained by Enrico Moiso. Last update 07/11/2022.
