Winnie09/Lamian

Lamian test

ylxing001 opened this issue · 6 comments

Hi Wenpin,

I am using Lamian package for pseudotime analysis between 4 different conditions (4 databases DBs).
I have a trouble with Lamian test function, is it because of different dimension of res$pseudotime (846 instead of 652)? I am not sure how to fix it.

My codes are indicated below:

str(pca@cell.embeddings)
num [1:652, 1:30] -5.271 -1.228 -1.341 -0.552 -3.717 ...

  • attr(*, "dimnames")=List of 2
    ..$ : chr [1:652] "AAAGAACAGCGGCTCT-1_1" "AAAGGTAAGCATTCAT-1_1" "AAAGGTAAGCGTTCAT-1_1" "AAAGGTAAGTGTTCAT-1_1" ...
    ..$ : chr [1:30] "PC_1" "PC_2" "PC_3" "PC_4" ...

str(Expression)
num [1:3000, 1:652] -0.168 -0.391 -0.247 -0.237 -0.313 ...

  • attr(*, "dimnames")=List of 2
    ..$ : chr [1:3000] "St18" "3110035E14Rik" "Vcpip1" "Cops5" ...
    ..$ : chr [1:652] "AAAGAACAGCGGCTCT-1_1" "AAAGGTAAGCATTCAT-1_1" "AAAGGTAAGCGTTCAT-1_1" "AAAGGTAAGTGTTCAT-1_1" ...

str(cellanno)
'data.frame': 652 obs. of 2 variables:
$ Cell: chr "AAAGAACAGCGGCTCT-1_1" "AAAGGTAAGCATTCAT-1_1" "AAAGGTAAGCGTTCAT-1_1" "AAAGGTAAGTGTTCAT-1_1" ...
$ stim: chr "DB1" "DB2" "DB3" "DB4" ...

set.seed(12345)
res = infer_tree_structure(pca = pca@cell.embeddings,
expression = Expression,
cellanno = cellanno,
origin.marker = c('Nxph1'),
number.cluster = 9,
xlab='PC_1',
ylab = 'PC_2')

str(res$pseudotime)
Named int [1:846] 1 2 3 4 5 6 7 8 9 10 ...

  • attr(*, "names")= chr [1:846] "CGGTCAGGTGTGAATA-1_4" "CGGTCAGGTGTGAATA-1_1" "CGGTCAGGTGTGAATA-1_3" "TGTAAGCGTCTTGGTA-1_4" ...

Res <- lamian.test(expr =Expression,

  •                cellanno = cellanno, 
    
  •                pseudotime =res$pseudotime, 
    
  •                test.type = 'variable', 
    
  •                permuiter = 5)
    

Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
'data' must be of a vector type, was 'NULL'

Also, I am not sure how to generate 'design' with my datasets.

Thanks for your help!

Hi @ylxing001,

Thank you for the question! The current released version of Lamian supports two types of sample covariates in the XDE test. The first type is that the sample covariate is a two-level categorical variable (e.g. DB1 versus DB2 in your example), and the second type is that the sample covariate is a continuous variable (e.g. the sample ages). The XDE test in terms of these two types of covariates has been tested and evaluated. We are actively working on the case that the categorical variable has more than two levels. In your datasets with four conditions, a quick way supported by the current version is to conduct a pairwise comparison on the conditions: DB1 vs. DB2, DB2 vs. DB3, DB3 vs DB4, etc.. To do that, you can generate the design as follows.

(Suppose there are two samples in DB1, and two other samples in DB2):

design = data.frame(intercept = rep(1,4), condition = c(1,1,0,0), stringsAsFactors = FALSE)
rownames(design) = paste0('sample', seq(1,4))
design

```        intercept condition
sample1         1         1
sample2         1         1
sample3         1         0
sample4         1         0

Please note that we need to have at least two samples in any of the conditions to support the population-level estimation.  And the row names of the **design** should be consistent with the sample names in the second column of **cellanno**. 

If you are interested in testing a continuous sample covariate, such as the sample age,  the design can be generated in this way:

design = data.frame(intercept = rep(1,4), age = c(25, 29, 38, 52), stringsAsFactors = FALSE)
rownames(design) = paste0('sample', seq(1,4))
design

        intercept age
sample1         1  25
sample2         1  29
sample3         1  38
sample4         1  52

Hope this helps!


Hi @Winnie09

Thanks for providing your information with regard to the design. Please let me know if I get this correctly:

head(design)
Cell Intercept group
1 AAAGAACAGCGGCTCT-1_1 1 1
2 AAAGGTAAGCATTCAT-1_1 1 1
3 AAAGGTAAGCGTTCAT-1_1 1 1
4 AAAGGTAAGTGTTCAT-1_1 1 1
5 AACCATGAGTCACGAG-1_1 1 1
6 AACCATGGTACTGCGC-1_1 1 1

tail(design)
Cell Intercept group
304 TTCGGTCTCTGGGCAC-1_2 1 0
305 TTGACCCTCACGGGCT-1_2 1 0
306 TTGGGCGGTGCATTAC-1_2 1 0
307 TTGTTCACACGCGCTA-1_2 1 0
308 TTTCCTCGTGACACGA-1_2 1 0
309 TTTGGAGAGATCGACG-1_2 1 0

This is the design I made for a pairwise comparison between DB1 and DB2. But I've faced another issue:

Res <- lamian.test(expr =Expression,

  •                cellanno = cellanno, 
    
  •                pseudotime =pseudotime, 
    
  •                design=design,
    
  •                test.type = 'variable', 
    
  •                permuiter = 5)
    

[1] "fitting model: overall: CovariateTest (Model 3 vs.1), or ConstantTest (Model 1) ..."
Error: $ operator is invalid for atomic vectors

How can I fix this? I am quite new to R coding.

Thanks

Hi @ylxing001,

I was running into this issue a few days ago. Try casting the desgin as a matrix with the following code:

design <- as.matrix(design) 
design2 <- design[,-1]
rownames(design2) <- design[,1]
class(design2) <- "numeric"

And then run the test with design2. That's the solution that worked for me, so let me know if that helps.

Thanks so much @grenkoca ! And @ylxing001, it seems that your design matrix is not numeric. The first column should be all 1s (as intercept) and the second column indicates the group (0 and 1 in your case). If you are still running into the same issue, I am happy to meet in zoom to discuss it if that would work for you.

Thanks @Winnie09 and @grenkoca for your help! I tried Caleb's code but still not working. It has also been brought to my attention that result[[2]] and result[[3]] returned 0 and NA values for most of cells. @Winnie09 I am happy to meet you and I will email you shortly to arrange a time for Zoom meeting. Thank you guys!

Happy to do that!