jackbibby1/SCPA

Pathway expression


Hello, thanks for this great package.

I want to make a plot of pathway expression similar to Figure 4C in your paper, but I don't see any mention of this in the vignettes. How can I do this?

I found the function pathway_matrices on the docs but I'm getting an error when I try to use it:

pathways = msigdbr("Homo sapiens", "H") %>% format_pathways()
z = pathway_matrices(samples=list(s1, s2), pathways=pathways)

Error in paste(gmt_files, collapse = "\n"): object 'gmt_files' not found
Traceback:
1. pathway_matrices(samples = list(s1, s2), pathways = pathways)
2. get_paths(pathways)
3. message("Generating gene sets from: \n", paste(gmt_files, collapse = "\n"), 
 .     "\n")
4. .makeMessage(..., domain = domain, appendLF = appendLF)
5. lapply(list(...), as.character)
6. paste(gmt_files, collapse = "\n")

Hi,

This was just a general bit of code that I wrote. This functionality wasn't included in the SCPA package because you would also need to run some trajectory inference method to plot it. All the code used to generate the paper figures is here, and the specific code to replicate Figure 4C is here on lines 82-124. It basically extracts the expression values for each gene of each pathway and orders them by the pseudotime values that I calculated earlier in the analysis here, before plotting with ggplot.
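To illustrate the idea of ordering pathway gene expression by pseudotime, here is a minimal sketch on toy data. It is not the paper's code: the matrix `expr`, the `pseudotime` vector, and the gene and cell names are all made-up stand-ins, and the line plot at the end is only one possible way to display the result.

```r
# Toy stand-ins (assumptions, not real data): a genes x cells expression
# matrix for a single pathway, and one pseudotime value per cell as you
# would get from a trajectory inference method.
set.seed(1)
n_genes <- 5
n_cells <- 20
expr <- matrix(
  rnorm(n_genes * n_cells),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_len(n_cells)))
)
pseudotime <- setNames(runif(n_cells), colnames(expr))

# Reorder the cells (columns) by increasing pseudotime
cell_order <- names(sort(pseudotime))
expr_ordered <- expr[, cell_order, drop = FALSE]

# Reshape to long format: one row per gene/cell pair.
# as.vector() reads the matrix column by column, so genes cycle fastest.
plot_df <- data.frame(
  gene       = rep(rownames(expr_ordered), times = ncol(expr_ordered)),
  cell       = rep(colnames(expr_ordered), each = nrow(expr_ordered)),
  pseudotime = rep(unname(pseudotime[cell_order]), each = nrow(expr_ordered)),
  expression = as.vector(expr_ordered)
)

# Optional plot, built only if ggplot2 is installed; call print(p) to draw it
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(
    plot_df,
    ggplot2::aes(x = pseudotime, y = expression, colour = gene)
  ) +
    ggplot2::geom_line()
}
```

With real data, `expr` would be the expression matrix of one pathway's genes and `pseudotime` would come from whichever trajectory method you ran.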

As for the pathway_matrices() function, I've just pushed an update to it (it was previously set up to take csv or gmt files, but I've now added the msigdbr functionality), so hopefully your code should work now. Just download the latest version of SCPA (v1.5.3) and run it again. Here's a general bit of code to get the expression data of two pathways:

library(SCPA)
library(Seurat)
library(tidyverse)
library(magrittr)

# get populations ---------------------------------------------------------
df <- readRDS("~/My Drive/example_datasets/naive_cd4.rds")
samples <- list(seurat_extract(df))

# get pathways ------------------------------------------------------------
pathways <- msigdbr::msigdbr("Homo sapiens", "H") %>% 
  format_pathways()

# generate pathway matrices -----------------------------------------------
pathway_mats <- pathway_matrices(samples = samples, 
                                 pathways = pathways,
                                 sample_names = "all_data")

# extract data for plotting -----------------------------------------------
glycolysis <- pathway_mats$all_data$HALLMARK_GLYCOLYSIS %>%
  colMeans(.) %>%
  data.frame() %>%
  set_colnames("Gly") %>%
  rownames_to_column("Cell")

apoptosis <- pathway_mats$all_data$HALLMARK_APOPTOSIS %>%
  colMeans(.) %>%
  data.frame() %>%
  set_colnames("Apop") %>%
  rownames_to_column("Cell")

gly_apop <- full_join(glycolysis, apoptosis, "Cell")

Jack

That's awesome, thanks for the quick reply!

I don't need the pseudotime. I just wanted to make a boxplot of pathway expression between two conditions, so I think the pathway_matrices function should work!
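For anyone with the same goal, here is a minimal sketch of the boxplot step. It assumes the output of pathway_matrices() is a nested list of genes x cells matrices indexed as `pathway_mats$<sample_name>$<pathway>`, which is how it is accessed in the example above. The toy list below only mimics that shape; the sample names `cond1` and `cond2` and the helpers `make_toy()` and `score_cells()` are hypothetical.

```r
# Toy stand-in for the pathway_matrices() output (an assumption): one
# genes x cells matrix per condition for a single pathway.
set.seed(42)
make_toy <- function(n_cells, shift, prefix) {
  matrix(
    rnorm(10 * n_cells, mean = shift),
    nrow = 10,
    dimnames = list(paste0("gene", 1:10),
                    paste0(prefix, "_cell", seq_len(n_cells)))
  )
}
pathway_mats <- list(
  cond1 = list(HALLMARK_GLYCOLYSIS = make_toy(30, 0, "c1")),
  cond2 = list(HALLMARK_GLYCOLYSIS = make_toy(40, 1, "c2"))
)

# Per-cell pathway score: mean expression across the pathway's genes
score_cells <- function(mat, condition) {
  data.frame(
    Cell      = colnames(mat),
    Condition = condition,
    Score     = unname(colMeans(mat))
  )
}

scores <- rbind(
  score_cells(pathway_mats$cond1$HALLMARK_GLYCOLYSIS, "cond1"),
  score_cells(pathway_mats$cond2$HALLMARK_GLYCOLYSIS, "cond2")
)

# Optional boxplot, built only if ggplot2 is installed; call print(p) to draw it
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(scores, ggplot2::aes(x = Condition, y = Score)) +
    ggplot2::geom_boxplot()
}
```

With real data, you would replace the toy list with the result of calling pathway_matrices() on your two conditions, passing two labels to `sample_names`.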