jackbibby1/SCPA

combined_metabolic_pathways.csv

Yuvalyos opened this issue · 4 comments

Hey,
I copied the file from https://jackbibby1.github.io/SCPA/articles/comparing_two_populations.html into an Excel file and saved it as a CSV, but I got this error:

Cell numbers in population 1 = 3141
Cell numbers in population 2 = 8025
- If greater than 500 cells, these populations will be downsampled

Error in x$Genes : $ operator is invalid for atomic vectors

I saw in issue #46 that the problem might be the pathway file.
I also tried running compare_pathways() on the same data and the same groups with pathways from msigdbr, and it worked well:

pathways <- msigdbr("Homo sapiens", "H") %>%
  format_pathways() 

Can you help me please?
I attached a screenshot of the pathway file in my R script.

Thank you
Yuval


Hey,

I imagine this is due to a copy/paste and Excel > CSV conversion issue, where Excel forces certain cell formatting changes. Instead of copying the contents, use the "Download raw file" option on GitHub, which downloads the CSV file directly so there are no formatting issues.

I just ran the code below using this file and it works fine on my end.

library(Seurat)
library(SCPA)

# data --------------------------------------------------------------------
df <- readRDS("~/My Drive/example_datasets/naive_cd4.rds")
pathways <- "~/Downloads/combined_metabolic_pathways.csv"

# populations to compare --------------------------------------------------
p1 <- seurat_extract(df, meta1 = "Hour", value_meta1 = 0)
p2 <- seurat_extract(df, meta1 = "Hour", value_meta1 = 12)

# compare -----------------------------------------------------------------
scpa_out <- compare_pathways(samples = list(p1, p2),
                             pathways = pathways,
                             parallel = TRUE,
                             cores = 4)

Does that help things?

Jack
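
For anyone wanting to check whether Excel has altered a pathway file, a rough sketch in base R is below. The toy file, its layout (pathway name followed by gene symbols on each row), and the gene names are assumptions made for illustration; they are not taken from combined_metabolic_pathways.csv. The check only looks for date-like entries, which is the usual result of Excel converting symbols such as MARCH1 or SEPT2.

```r
# Hypothetical toy file standing in for a pathway CSV that went through Excel.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("PATHWAY_A,SEPT1,GENE2", "PATHWAY_B,1-Mar,GENE5"), csv_path)

# Read the raw text so nothing is re-parsed or coerced on the way in.
fields <- unlist(strsplit(readLines(csv_path), ",", fixed = TRUE))

# Flag entries such as "1-Mar" or "2-Sep", which Excel typically produces
# when it interprets a gene symbol as a date.
months <- "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
date_like <- grepl(paste0("^[0-9]{1,2}-(", months, ")$"), fields)
suspect <- fields[date_like]
print(suspect)
```

An empty result does not prove the file is intact, since Excel can change formatting in other ways, but any hit is a strong sign the file should be downloaded again.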

Thank you for the fast answer!
Unfortunately not, still get the same error and I downloaded the raw file and copied your code :(

Processing in parallel using 4 cores

Cell numbers in population 1 = 3141
Cell numbers in population 2 = 8025
- If greater than 500 cells, these populations will be downsampled

Error in x$Genes : $ operator is invalid for atomic vectors 

Hmm. It's a bit tough to see what you're doing without your full code. Can you send me the code that you're using for all the pathway analysis, like the end-to-end snippet I sent above? i.e. generating your populations/expression matrices > generating the pathway list > and the compare_pathways() or compare_seurat() function.

Can you also run this:

pathways <- "~/Downloads/combined_metabolic_pathways.csv" # the path to where your csv file is
test <- SCPA::get_paths(pathways)
lapply(test[1:3], function(x) head(x, 5))

And send over the output. You should get something like:

[[1]]
# A tibble: 5 × 2
  Pathway                       Genes  
  <chr>                         <chr>  
1 HALLMARK_BILE_ACID_METABOLISM SCP2   
2 HALLMARK_BILE_ACID_METABOLISM ABCD3  
3 HALLMARK_BILE_ACID_METABOLISM SLC27A2
4 HALLMARK_BILE_ACID_METABOLISM HSD3B7 
5 HALLMARK_BILE_ACID_METABOLISM HSD17B4

[[2]]
# A tibble: 5 × 2
  Pathway                        Genes
  <chr>                          <chr>
1 HALLMARK_FATTY_ACID_METABOLISM ACAA1
2 HALLMARK_FATTY_ACID_METABOLISM ACAA2
3 HALLMARK_FATTY_ACID_METABOLISM ACADL
4 HALLMARK_FATTY_ACID_METABOLISM ACADM
5 HALLMARK_FATTY_ACID_METABOLISM ACOT8

[[3]]
# A tibble: 5 × 2
  Pathway             Genes
  <chr>               <chr>
1 HALLMARK_GLYCOLYSIS PGK1 
2 HALLMARK_GLYCOLYSIS ALDOA
3 HALLMARK_GLYCOLYSIS ENO1 
4 HALLMARK_GLYCOLYSIS TPI1 
5 HALLMARK_GLYCOLYSIS PFKP 

From the error, I'm assuming there will be an issue with this part.

Jack

Thank you very much, the problem was on my end: I had read the file in with read.csv().
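
For anyone who lands here with the same error, a minimal base R sketch of the likely mechanism is below. It does not reproduce SCPA's internals; the toy data and the assumption that the pathways are processed element by element with `x$Genes` are inferred from the error message and from the get_paths() output shown above.

```r
# A data frame standing in for what read.csv() returns (hypothetical toy data).
pathways_df <- data.frame(
  Pathway = c("PATHWAY_A", "PATHWAY_A", "PATHWAY_B"),
  Genes   = c("GENE1", "GENE2", "GENE3")
)

# Iterating over a data frame visits its columns, which are atomic vectors,
# so `$` fails on the first element.
err <- tryCatch(
  lapply(pathways_df, function(x) x$Genes),
  error = function(e) conditionMessage(e)
)
print(err)

# A list with one data frame per pathway, the shape shown in the get_paths()
# output above, supports `$Genes` on every element.
pathways_list <- split(pathways_df, pathways_df$Pathway)
genes <- lapply(pathways_list, function(x) x$Genes)
print(genes)
```

In the thread, the working call passes the path to the CSV file as `pathways` rather than an object created with read.csv().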