PERsonalized single-Cell Expression-based Planning for Treatments In ONcology (PERCEPTION)
We build a precision oncology computational approach capitalizes on recently published matched bulk and single-cell (SC) transcriptome profiles of large-scale cell-line drug screens to build treatment response models from patients' SC tumor transcriptomics. The general objective of this project is to utilize single-cell omics from patients tumor to predict response and resistance. The following figure describe the architecture of PERCEPTION pipeline.
This section will describe how to reproduce the figures described in the following manuscript. Section 1: How to reproduce the Figures from the PERCEPTION manuscript?
Sinha, S., Vegesna, R., Mukherjee, S., Kammula, A., Dhruba, S.R., Wu, W., Kerr, L., Nair, N., Jones, M., Yosef, N., Stroganov, O., Grishagin, I., Aldape, K., Blakely, C., Jiang, P., Thomas, C., Benes, C., Bivona, T., Schäffer, A. and Ruppin, E. (2023). "PERCEPTION: Predicting patient treatment response and resistance via single-cell transcriptomics of their tumors".
The code to generate the Figures is in R and is presented as a set of R markdown(.Rmd) files to be run in Rstudio. The data files are in a mixture of RDS, csv, tsv, and other formats. Because we want to have version control and quickly do updates, the code is in a GitHub repository. Because the data files are large and because journals require a frozen snapshot of data, the data files are in a Zenodo repository. To reproduce the results one must clone the GitHub repository and download the files from the Zenodo repository into a common directory on a computer running Linux or MacOS. We have tested the following instructions on one computer of each type.
Please perform the following steps to reproduce the Figures from the manuscript:
Step 1: Clone/Download this repository using the command
git clone https://github.com/ruppinlab/PERCEPTION
This git clone command should create a directory called PERCEPTION that contains this README.md file and two subdirectories called Tools and Figures. The Tools subdirectory contains the .Rmd files to be run. The Figures subdirectory contains the Figures.
Steps 2: Run the following four unix commands
mkdir Figures_expected
cd Figures
cp * ../Figures_expected
cd ..
These commands copy the figures provided into a new folder called Figures_expected, so that if you run the Figure generating commands and thereby replace the Files in the Figures subdirectory, you can subsequently compare the newly generated files in Figures to the expected files in Figures_expected.
Step 3: Run
mkdir Data
This command creates a Data subdirectory that is parallel to the Figures and Tools subdirectories. The Data subdirectory will contain the data files, which you will download from Zenodo in the next step.
Step 4: Download the data files from the following Zenodo repository https://doi.org/10.5281/zenodo.7860559 into the Data subdirectory. Due to the limitations of Zenodo, you must download one file at a time. The data files expected are listed near the bottom of this file.
Step 5: While in the Data directory,
run
unzip Supp_COhen_IdoAmit_etal.zip
cd ../Tools
Step 6: In each of
Step2_Figure1.Rmd
Step3_Figure3.Rmd
Step5_Figure5_and_Extended_Figure9_10.Rmd
Step6_ExtendedFigure4-6.Rmd
Step7_ExtendedFigure1.Rmd
Step8_Figure6.Rmd
you need to edit the file once manually to set the "working_directory" to the full path to the your repository PERCEPTION. e.g. working_directory='/Users/sinhas8/PERCEPTION/'
Step 7: From the PERCEPTION directory open RStudio You may need to run setwd() to set the working directory.
Step 8: "Step0A" comprises a list of libraries from line 372-399 that are required to run the following scripts. Please install them in your Rstudio environment. [Please refer to this to install a new library: https://r-coder.com/install-r-packages/]
Step 9: Using the File pull-down menu in Rstudio, load and run in succession: Tools/Step2_Figure1.Rmd; Tools/Step3_Figure3.Rmd; Tools/Step5_Figure5_and_Extended_Figure9_10.Rmd; Tools/Step6_ExtendedFigure4-6.Rmd; Tools/Step7_ExtendedFigure1.Rmd; Step8_Figure6.Rmd. The filenames are chosen so that "StepN" represents the order of the results are produced and presented in the manuscript.
Step 10 (Optional, open-ended): Using whatever criteria you wish compare the newly generated figures in the Figures subdirectory to the manuscript figures in the Figures_expected subdirectory
Notes on the structure of repository:
- Each script aims (Except Step0A and Step0B) to reproduce at least one figure from either main text or supplementary.
- Every script is self-contained.
- The scripts other than Step0A and Step0B and Step1 can be run in any order.
- The two scripts starting with "step0" provides the necessary functions to run the rest of the scripts and are accordingly called in the beginning of each scripts.
- "Step1" is essential to build the PERCEPTION model. By default, this step uses the pan-cancer to build the response model for 44 drugs. However, users can modify this script to change the ‘cancer type’ to build the model for individual cancer types and also can run the PERCEPTION model for specific drugs. However, we found that the pan-cancer model shows better performances than the model built based on individual cancer types.
Scripts to expect from Github repo: A folder name "Tools" comprising following files:
step0A_functions_needed.Rmd
step0B_functions_needed.Rmd
Step1_Build_Perception_models.Rmd
Step2_Figure1.Rmd
Step3_Figure3.Rmd
Step5_Figure5_and_Extended_Figure9_10.Rmd
Step6_ExtendedFigure4-6.Rmd
Step7_ExtendedFigure1.Rmd
Step8_Figure6.Rmd
Files to get from zenodo and place in the Data subdirectory:
carfilzomib_lenalidomide_Bulk_models_list.RDS
carfilzomib_lenalidomide_model.RDS
carfilzomib_lenalidomide_model_bulk_rf.RDS
DepMapv12.RDS
EGFR_WT_signature.csv
EGFR_WT_signature_PDC.csv
FDA_approved_drugs_models.RDS
genes_across_scRNA_datasets_ofInterest.RDS
genesUsed_toBuild.RDS
lung_killing_n_abundance_clone_keyDrugs.RDS
lung_tSNE (1).txt
Maynard_Supp2_Demo.xlsx
model_performances.RDS
Predicted_viability_ScreenNishanthApr17.RDS
PRJNA591860.RDS
PRJNA591860_patient_level_killing.RDS
PRJNA591860_sample_cell_names.RDS
Responde_models_usedin_PRJNA591860_lung.RDS
Response_model_nutlin3.RDS
Ruppin120321.xlsx
subClone_id_PRJNA591860.RDS
summary_AUC.tsv
summary_combAUC.tsv
Supp_COhen_IdoAmit_etal.zip
Suppl_table_processed_drugresponse.xlsx
TableS5.csv
TableS5 copy.csv
Model_Perforamnces_PatientCohorts.RDS
To provide guidance to the users on how PERCEPTION could be used for new clinical trial datasets, we have generated a new script named “Running_PERCEPTION_for_new_dataset.Rmd”. This script will provide the fundamental functionality of PERCEPTION in predicting treatment response using a patient's scRNA-seq expression data for particular drugs or combination therapies. The following figure describe how PERCEPTION could be utilized to predict treatment response. Section 2: How to utilize PERCEPTION for a new clinical trial dataset with sc-expression?
It should be emphasized that the application of PERCEPTION is not limited to the aforementioned script. Researchers can utilize the PERCEPTION model to explore various aspects of their studies. The scripts (Step2_Figure1.Rmd , Step3_Figure3.Rmd, Step5_Figure5_and_Extended_Figure9_10.Rmd, Step6_ExtendedFigure4-6.Rmd, Step7_ExtendedFigure1.Rmd; Step8_Figure6.Rmd) described in the Section 1 has already provided examples of this, where we employed PERCEPTION to analyze three different clinical trial datasets for different cancer types and addressed several aspects associated with individual datasets.
To run the PERCEPTION on the new clinical trial dataset, please use Tools/Running_PERCEPTION_for_new_dataset.Rmd. The guidance for using "Running_PERCEPTION_for_new_dataset.Rmd" script.
The two main steps are Step1 and Step2 in this script, where attention is needed. In Step1 (lines 32-38), the users need to upload their own files; in Step2 (line 50), the users need to provide the name of their drug of interest for which they want to build the PERCEPTION model and predict the treatment response.
The two main steps are Step1 and Step2 in this script, where attention is needed. In Step1 (lines 32-38), the users need to upload their own files; in Step2 (line 50), the users need to provide the name of their drug of interest for which they want to build the PERCEPTION model and predict the treatment response.
We provided an example of how the script utilizes sample datasets to build the PERCEPTION model for two drugs and subsequently predict treatment responses in a particular type of cancer.
We have tested this code for a demo dataset; the authors should use this example to format their input files. In Step1 (lines 32-38) of Running_PERCEPTION_for_new_dataset.Rmd, the example input files are provided. The authors should follow these file types to build their own datasets. Basically, three input files are needed as input (i) single-cell gene expression matrix from patients, (ii) cell names in each patient, and (iii) patient demographics, including clinical response data. How to generate the input files for the new dataset?
We used the PRJNA591860 dataset as an example/demo. We are unable to upload these files in GitHub due to restrictions on file size. Please download the PRJNA591860.RDS and PRJNA591860_sample_cell_names.RDS from the Zenodo repository Where is the test dataset available?https://doi.org/10.5281/zenodo.7860559 and placed these into the Data directory. If you already downloaded all the files from Zenodo and placed them in the Data directory; then you don't need to download these two files again. PRJNA591860.RDS contains the single-cell gene expression matrix and PRJNA591860_sample_cell_names.RDS contains the cell names in each patient.
For patient demographics and clinical response data, we have uploaded a sample file named Sample_data_response.xlsx in the Test/Test_data directory. Please download this file in your Data directory.
The Step2 of this code (Running_PERCEPTION_for_new_dataset.Rmd) demonstrated how the users could use it to build the response model for specific drugs. We have used the example of two drugs, dabrafenib and erlotinib. The users could select any combination of drugs from the list of 44 drugs to build their own response model. We have described it within the code. How to build the model and run the code for specific drugs?
This script identifies the major cancer cell clusters in the patient's tumor using their sc-expression (transcriptional clone). By computing the mean expression of each transcriptional clone and providing it as an input to the drug response model, this script predicts the drug response of each transcriptional clone separately. Considering that the most resistant clone to the drug will likely get selected by the treatment, this script predicts the overall patient’s response as the predicted response of the most resistant clone. What are the expected outputs of "Running_PERCEPTION_for_new_dataset.Rmd" script?
The figures generated using the sample dataset are provided in the Test/Test_figures directory.
Contact: Sanju Sinha (sanju@terpmail.umd.edu)
Cancer Data Science Laboratory (CDSL), National Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, Maryland, USA