/tensor_gp

Code for Tensor Gaussian Process Regression for predicting drug combination synergy, developed for the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge 2015: https://www.synapse.org/#!Synapse:syn4231880/wiki/235645

More details of the method are available here:
https://www.synapse.org/#!Synapse:syn5586109/wiki/394920
  
Required R packages: rstan, data.table, dplyr, doMC (or at least foreach), irlba (if you want to run the convex PCA component).  
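One way to install the packages listed above is sketched below. The helper name `r_install_expr` and the choice of CRAN mirror are assumptions made for illustration; the block only prints the R expression, so nothing is installed until you pass it to Rscript yourself. Note that rstan also needs a working C++ toolchain.

```shell
#!/bin/sh
# Packages named in this README; foreach is the fallback mentioned for doMC.
PKGS="rstan data.table dplyr doMC foreach irlba"

# Print an R expression that installs whichever of those packages are missing.
# (Hypothetical helper; the CRAN mirror is an assumption.)
r_install_expr() {
  quoted=""
  for p in $PKGS; do
    # Build a comma-separated list of double-quoted package names.
    quoted="${quoted:+$quoted, }\"$p\""
  done
  printf 'pk <- c(%s); install.packages(setdiff(pk, rownames(installed.packages())), repos = "https://cloud.r-project.org")\n' "$quoted"
}

# Show the expression without running R.
r_install_expr
# To run it for real: Rscript -e "$(r_install_expr)"
```

Drop irlba from `PKGS` if you do not plan to run the convex PCA component.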

Data required:
     - Both the Drug_synergy_data and Sanger_molecular_data directories must be in the main directory. ch1_LB.csv and ch2_LB.csv must also be in the Drug_synergy_data directory.
     - CCLE expression data file: CCLE_Expression_Entrez_2012-09-29.gct available from http://www.broadinstitute.org/ccle/data/browseData?conversationPropagation=begin
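
Before running anything, the layout described above can be checked with a small script along these lines. This is a sketch only: `check_data_layout` is a hypothetical helper, and placing the CCLE .gct file in the main directory is an assumption, since its location is not stated here.

```shell
#!/bin/sh
# Check that the inputs listed above are present under a project root.
# Assumption: the CCLE .gct file sits in the root; adjust if you keep it elsewhere.
check_data_layout() {
  root="${1:-.}"
  missing=0
  for d in Drug_synergy_data Sanger_molecular_data; do
    if [ ! -d "$root/$d" ]; then
      echo "missing directory: $d"
      missing=1
    fi
  done
  for f in Drug_synergy_data/ch1_LB.csv \
           Drug_synergy_data/ch2_LB.csv \
           CCLE_Expression_Entrez_2012-09-29.gct; do
    if [ ! -f "$root/$f" ]; then
      echo "missing file: $f"
      missing=1
    fi
  done
  # Return 0 when everything was found, 1 otherwise.
  return "$missing"
}

# Example: report problems for the current directory without aborting the shell.
check_data_layout . || echo "data layout incomplete"
```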

Key files: 
  - molecular_data.R processes the raw cell line data and produces distance matrices. It additionally imputes gene expression (GE) for two cell lines from CCLE, and imputes methylation.
  - mono_pca_all.R performs a nuclear-norm regularized PCA on the monotherapy data.
  - train_gp.R produces results for all challenge tasks. Usage:
   Rscript train_gp.R <run> <setup> <sub> <cores> <usetissue> <max_its>
  
With the full training data (we use all of Ch 1 and Ch 2 for both), memory consumption is high (roughly 40 GB per thread), so I only run two cores on the 120 GB machines we have. Ch 1A was therefore trained using:
Rscript train_gp.R 1 final2 A 2 1 30
and predictions are made using:
Rscript train_gp.R 0 final2 A 1 1 30
Similarly for Ch 1B (just change A -> B).
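
The train-then-predict recipe for Ch 1A and Ch 1B can be sketched as a dry run that only prints the commands. It reads the first argument as 1 for training and 0 for prediction, which is what the commands above suggest. The helper names `max_cores` and `ch1_commands` are hypothetical, and the 20 GB of memory held in reserve is an illustrative guess chosen so that a 120 GB machine at 40 GB per thread yields two cores.

```shell
#!/bin/sh
# Largest thread count that fits in memory while keeping some headroom.
# Arguments: total GB, GB per thread, GB held in reserve (all integers).
max_cores() {
  echo $(( ($1 - $3) / $2 ))
}

# Print the train and predict commands for one sub-challenge (A or B).
# Argument order follows the usage line: run setup sub cores usetissue max_its.
ch1_commands() {
  which_sub="$1"
  train_cores="$2"
  echo "Rscript train_gp.R 1 final2 $which_sub $train_cores 1 30"   # run=1: train
  echo "Rscript train_gp.R 0 final2 $which_sub 1 1 30"              # run=0: predict
}

# The 20 GB reserve is an assumption, not a measured figure.
cores=$(max_cores 120 40 20)
for s in A B; do
  ch1_commands "$s" "$cores"
done
```

Pipe the output to `sh` once the printed commands look right, or copy them one at a time so that prediction only starts after training has finished.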

For Ch 2, the training from Ch 1A can be reused (pick the seed with the highest likelihood); then run sub2_predict.R. In principle this could be done with a Stan run, but in practice it is faster to do it manually.
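
Picking the top-likelihood seed could look like the sketch below. It assumes you have collected one line per seed in the form `seed log_likelihood`; that two-column layout, the helper name `pick_top_seed`, and the numbers in the example are all made up for illustration and do not describe what train_gp.R actually writes.

```shell
#!/bin/sh
# Read "seed log_likelihood" pairs on stdin and print the seed with the
# highest log-likelihood. The two-column input layout is hypothetical.
pick_top_seed() {
  awk 'NR == 1 || $2 + 0 > best { best = $2 + 0; seed = $1 }
       END { if (NR > 0) print seed }'
}

# Made-up values, for illustration only; this prints 12.
printf '%s\n' '11 -1050.2' '12 -990.7' '13 -1001.3' | pick_top_seed
```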