diazlab/CONICS

extract the samples with CNV

Closed this issue · 5 comments

Thanks for the fantastic tool! I am currently using it on my single cell data.

However, I found some inconsistencies when I try to extract the cells with CNV.

First, I computed per-cell sums from the binarized matrix: "bin_mat=binarizeMatrix(redu,normal,tumor,0.8); cells1<-apply(bin_mat,1, sum)". I assume the cells with higher cells1 values are the cells with CNVs.

On the other hand, I also tried cutting the tree of the heatmap generated by "plot1<-plotBinaryMat(bin_mat,patients,normal,tumor,patient=patients); CNVs= cutree(plot1$tree_col,k=3)". I then took the cluster that shows CNVs in the heatmap. Surprisingly, these cells differed greatly from the ones I got from cells1 above.

I am confused by this. Could you explain which approach is right, or how to get the cells with CNVs?

Thanks a lot!

Hey, thanks for using CONICSmat!
The first thing that comes to mind is that binarizeMatrix produces NAs for cells whose posterior lies between 0.2 and 0.8, given the threshold parameter (0.8). Therefore, cells1<-apply(bin_mat,1, sum) will most likely contain entries that are NA. One option is to use cells1<-apply(bin_mat,1, sum,na.rm=T). If you do that, could you run
intersect(cells1,cells2), where cells2 are the cells you selected by cutting the tree?
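
To illustrate the NA behaviour described above, here is a minimal base R sketch on a made-up matrix. The cell names, region names, and the cells2 vector are hypothetical and not taken from any real CONICSmat run; the point is only how sum treats NA and that two selections are easiest to compare by cell name.

```r
# Toy binarized matrix: rows are cells, columns are CNV regions.
# 1 = altered, 0 = not altered, NA = posterior too ambiguous to call.
bin_mat <- matrix(
  c(1, NA, 0,
    0,  0, 0,
    1,  1, NA,
    NA, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("cell_A", "cell_B", "cell_C", "cell_D"),
                  c("1p", "7q", "10q"))
)

# Without na.rm, a single NA in a row makes the whole row sum NA.
sums_raw <- apply(bin_mat, 1, sum)

# With na.rm = TRUE, the ambiguous calls are skipped.
cells1 <- apply(bin_mat, 1, sum, na.rm = TRUE)

# Cells with at least one confident CNV call, selected by name.
cnv_cells <- names(cells1)[cells1 > 0]

# Hypothetical second selection, e.g. the members of one heatmap cluster.
cells2 <- c("cell_A", "cell_C", "cell_D")

# Compare the two selections by cell name, not by the summed values.
shared <- intersect(cnv_cells, cells2)
only_in_cluster <- setdiff(cells2, cnv_cells)
```

Here setdiff lists the cells that the cluster-based selection contains but the sum-based selection does not, which is usually the quickest way to see where two approaches disagree.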

A second way to avoid this would be:
bin_mat=binarizeMatrix(redu,normal,tumor,0.5). This thresholds the posterior probability at 0.5, so no NAs are generated.
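
The effect of the threshold can be sketched with a small stand-in function. Note that binarize_posterior below is a hypothetical helper written for this illustration, not the CONICSmat implementation; it assumes the posterior is the probability of the altered state and that values between 1 - threshold and threshold are left as NA, as described above.

```r
# Hypothetical helper, NOT the CONICSmat code: call 1 when the posterior
# is above `threshold`, 0 when it is below 1 - threshold, otherwise NA.
binarize_posterior <- function(post, threshold) {
  out <- rep(NA_real_, length(post))
  out[post > threshold] <- 1
  out[post < 1 - threshold] <- 0
  names(out) <- names(post)
  out
}

# Made-up posteriors for four cells in one region.
post <- c(cell_A = 0.95, cell_B = 0.65, cell_C = 0.30, cell_D = 0.05)

# Threshold 0.8: posteriors between 0.2 and 0.8 stay uncalled (NA).
strict <- binarize_posterior(post, 0.8)

# Threshold 0.5: the ambiguous zone collapses, so these cells all get a call.
loose <- binarize_posterior(post, 0.5)
```

The trade-off is that the 0.5 threshold forces a call even for cells whose posterior is close to 0.5, so those calls are less reliable than the ones kept at 0.8.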

Let me know if this helped.
Thanks,
Soren

I changed the code for plotting the binary matrix, so please update your CONICSmat version (you can just re-install it from git). On the bottom right the plot now indicates which cluster has which ID, and the function returns a cluster assignment for each cell. The order of the clusters in the plot might have been an issue as well: pheatmap does not order them sequentially.

[Image: cmat_plot]

plot1=plotBinaryMat(bin_mat,patients,normal,tumor,patient="MGH97")

which(plot1==3)
    97_P3_A07     97_P3_C10     97_P3_B02     97_P3_B01     97_P3_H10     97_P5_C04     97_P6_H04     97_P5_C01     97_P6_A11     97_P6_F10 
           24            44            58            65           105           121           125           147           162           223 
 MGH97_P7_G11  MGH97_P7_E01 MGH97_P10_H07 MGH97_P10_B10  MGH97_P8_B05 
          276           345           438           452           535
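
The remark above that clusters are not drawn in sequential order can be reproduced with base R alone. This is a sketch on made-up data, assuming the heatmap's column tree is an ordinary hclust object (as the cutree call on plot1$tree_col in the question suggests); the cell names c1 to c6 are hypothetical.

```r
# Toy data: six cells on one axis, forming three well-separated groups.
x <- c(c1 = 10, c2 = 0, c3 = 5, c4 = 10.5, c5 = 0.4, c6 = 5.3)
tree <- hclust(dist(x), method = "average")

# cutree numbers the clusters by first appearance in the input order.
clusters <- cutree(tree, k = 3)

# The plot, however, draws the leaves in dendrogram order.
leaf_order <- tree$labels[tree$order]

# Cluster IDs as they appear from left to right in the plot;
# this sequence need not be 1, 2, 3.
ids_left_to_right <- unique(clusters[leaf_order])
```

Reading cluster IDs off a heatmap by position is therefore unreliable; looking them up by cell name in the cutree result avoids the problem.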


@soerenmueller
Thanks for your prompt reply. My code actually removed the NAs already, i.e.
"extract.cells <- apply(bin_mat,1,function(i){sum(as.numeric(na.omit(i)))})", so I don't think the NAs are the issue.
I will re-install the package to get the updated function and check whether it works now. Thanks a lot, I will let you know.

@wasqqdyx
I checked again today with fresh eyes and actually there was a bug in the plotBinaryMat function. Thanks for pointing me to this! The names of cells were incorrectly assigned. I fixed the bug in the current version.

> which(cells1==0)
    97_P3_H11     97_P3_H03     97_P3_A11     97_P3_F01     97_P5_A04     97_P5_B06     97_P5_E07     97_P5_D04     97_P6_E04     97_P5_G10 
           24            44            58            65           105           121           125           147           162           223 
 MGH97_P7_C11  MGH97_P7_D08 MGH97_P10_G01  MGH97_P8_D10  MGH97_P9_C08 
          276           345           438           452           535 
> which(plot1==3)
    97_P3_H11     97_P3_H03     97_P3_A11     97_P3_F01     97_P5_A04     97_P5_B06     97_P5_E07     97_P5_D04     97_P6_E04     97_P5_G10 
           24            44            58            65           105           121           125           147           162           223 
 MGH97_P7_C11  MGH97_P7_D08 MGH97_P10_G01  MGH97_P8_D10  MGH97_P9_C08 
          276           345           438           452           535 
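
A check like the one above can also be done programmatically. The sketch below uses made-up vectors standing in for cells1 and plot1; it shows why comparing positions alone is not enough when names may have been assigned in the wrong order, which is the kind of mismatch described in this thread.

```r
# Toy stand-ins for the two per-cell vectors compared above.
cells1 <- c(cell_A = 2, cell_B = 0, cell_C = 1, cell_D = 0)
plot1  <- c(cell_A = 1, cell_B = 3, cell_C = 2, cell_D = 3)

no_cnv_by_sum     <- which(cells1 == 0)
no_cnv_by_cluster <- which(plot1 == 3)

# Positions and names should both agree between the two selections.
same_positions <- identical(unname(no_cnv_by_sum), unname(no_cnv_by_cluster))
same_names     <- identical(names(no_cnv_by_sum), names(no_cnv_by_cluster))

# Simulate mislabeled cells: the positions still agree, the names do not.
mislabeled <- plot1
names(mislabeled) <- rev(names(plot1))
names_after_mislabel <- identical(names(no_cnv_by_sum),
                                  names(which(mislabeled == 3)))
```

In this toy case same_positions and same_names are both TRUE, while names_after_mislabel is FALSE even though the selected positions are unchanged.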

I'm closing this thread, thanks for pointing me to the issue.