extract the samples with CNV
Thanks for the fantastic tool! I am currently using it on my single cell data.
However, I found some inconsistencies when I try to extract the cells with CNV.
First, I used "bin_mat=binarizeMatrix(redu,normal,tumor,0.8); cells1<-apply(bin_mat,1, sum)". I assume the cells with higher values in cells1 are the cells with CNVs.
Second, I used the heatmap generated by "plot1<-plotBinaryMat(bin_mat,patients,normal,tumor,patient=patients); CNVs=cutree(plot1$tree_col,k=3)" and took the cluster that showed CNVs in the heatmap. Surprisingly, those cells differed greatly from the ones I got from cells1 above.
I am confused by this. Could you explain which approach is right, or how to get the cells with CNVs?
Thanks a lot!
Hey, thanks for using CONICSmat!
My first guess is that binarizeMatrix produces NAs for cells whose posterior lies between 0.2 and 0.8, given the threshold parameter (0.8). In that case cells1<-apply(bin_mat,1, sum) will most likely contain NA entries. One way around this is to use cells1<-apply(bin_mat,1, sum,na.rm=T). If you do that, could you run
intersect(cells1,cells2), where cells2 are the cells you selected by cutting the tree?
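To illustrate the NA point, here is a minimal base R sketch on a toy matrix. The matrix, the cell names and the cells2 vector are made up for illustration and are not CONICSmat output; the sketch only assumes that rows are cells and that NA marks an uncertain call.

```r
# Toy stand-in for a binarized matrix: rows are cells, columns are regions,
# NA marks an uncertain posterior (hypothetical data, not CONICSmat output).
bin_mat <- matrix(
  c(1, NA, 0,
    0,  0, 0,
    1,  1, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cellA", "cellB", "cellC"), c("1p", "7q", "10q"))
)

# Without na.rm, a single NA in a row turns that row's sum into NA.
sums_raw <- apply(bin_mat, 1, sum)

# With na.rm = TRUE, NAs are skipped and every cell gets a count.
sums_clean <- apply(bin_mat, 1, sum, na.rm = TRUE)

# Compare cell names rather than the summed values when checking overlap.
cells1 <- names(sums_clean)[sums_clean > 0]
cells2 <- c("cellA", "cellC")  # hypothetical cells picked from the tree cut
overlap <- intersect(cells1, cells2)
```

Note that intersect() on the summed values themselves would compare counts, so comparing names is probably what is wanted here.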
A second way to avoid this would be:
bin_mat=binarizeMatrix(redu,normal,tumor,0.5). This thresholds the posterior probability at 0.5, so no NAs will be generated.
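The effect of the threshold can be sketched with a small stand-in function. binarize_toy below is a hypothetical helper written only for this illustration. It is not the CONICSmat implementation, and it assumes a posterior is called 1 above the threshold, 0 below one minus the threshold, and NA in between.

```r
# Hypothetical helper, NOT the CONICSmat code: confident calls become 0 or 1,
# everything between (1 - threshold) and threshold stays NA.
binarize_toy <- function(post, threshold) {
  out <- matrix(NA_real_, nrow(post), ncol(post), dimnames = dimnames(post))
  out[post > threshold] <- 1
  out[post < 1 - threshold] <- 0
  out
}

# Made-up posteriors for two cells and two regions (filled column-wise).
post <- matrix(c(0.95, 0.45, 0.10, 0.70), nrow = 2,
               dimnames = list(c("cellA", "cellB"), c("1p", "7q")))

strict <- binarize_toy(post, 0.8)  # 0.45 and 0.70 fall in the NA band
loose  <- binarize_toy(post, 0.5)  # the NA band collapses, every value is called

n_na_strict <- sum(is.na(strict))
n_na_loose  <- sum(is.na(loose))
```

The trade-off is that a 0.5 threshold forces a call even for cells with weak evidence.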
Let me know if this helped.
Thanks,
Soren
If you update your CONICSmat version (you can just re-install it from git), you will get the changed code for plotting the binary matrix. The bottom right of the plot now indicates which cluster has which ID, and the function returns a cluster assignment for each cell. The order of the clusters in the plot might have been an issue as well, since pheatmap does not order them sequentially.
plot1=plotBinaryMat(bin_mat,patients,normal,tumor,patient="MGH97")
which(plot1==3)
97_P3_A07 97_P3_C10 97_P3_B02 97_P3_B01 97_P3_H10 97_P5_C04 97_P6_H04 97_P5_C01 97_P6_A11 97_P6_F10
24 44 58 65 105 121 125 147 162 223
MGH97_P7_G11 MGH97_P7_E01 MGH97_P10_H07 MGH97_P10_B10 MGH97_P8_B05
276 345 438 452 535
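Since which() on a named vector returns positions labelled with names, the cell names for a cluster can be pulled out directly. A small sketch, assuming the returned assignment behaves like a named integer vector (the values below are invented):

```r
# Hypothetical cluster assignment shaped like a named integer vector.
cluster_id <- c(cellA = 1L, cellB = 3L, cellC = 2L, cellD = 3L)

# which() keeps the names, so names() gives the cells in cluster 3.
in_cluster3 <- names(which(cluster_id == 3))

# All clusters at once: a list of cell names keyed by cluster ID.
cells_by_cluster <- split(names(cluster_id), cluster_id)
```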
@soerenmueller
Thanks for your prompt reply. Actually my code removed the NAs, i.e.
"extract.cells <- apply(bin_mat,1,function(i){sum(as.numeric(na.omit(i)))})" ; so I don't think it's the NAs issue.
I will re-install the package to update the function and check if it works now. Thanks a lot and will let you know then.
@wasqqdyx
I checked again today with fresh eyes and actually there was a bug in the plotBinaryMat function. Thanks for pointing me to this! The names of cells were incorrectly assigned. I fixed the bug in the current version.
which(cells1==0)
97_P3_H11 97_P3_H03 97_P3_A11 97_P3_F01 97_P5_A04 97_P5_B06 97_P5_E07 97_P5_D04 97_P6_E04 97_P5_G10
24 44 58 65 105 121 125 147 162 223
MGH97_P7_C11 MGH97_P7_D08 MGH97_P10_G01 MGH97_P8_D10 MGH97_P9_C08
276 345 438 452 535
which(plot1==3)
97_P3_H11 97_P3_H03 97_P3_A11 97_P3_F01 97_P5_A04 97_P5_B06 97_P5_E07 97_P5_D04 97_P6_E04 97_P5_G10
24 44 58 65 105 121 125 147 162 223
MGH97_P7_C11 MGH97_P7_D08 MGH97_P10_G01 MGH97_P8_D10 MGH97_P9_C08
276 345 438 452 535
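Rather than comparing the two printouts by eye, the agreement can be checked programmatically. This is a sketch on invented values, assuming cells1 is a named vector of per-cell sums and plot1 a named vector of cluster IDs:

```r
# Invented example values standing in for the real vectors.
cells1 <- c(cellA = 0, cellB = 2, cellC = 0, cellD = 1)
plot1  <- c(cellA = 3L, cellB = 1L, cellC = 3L, cellD = 2L)

zero_sum_cells <- names(which(cells1 == 0))
cluster3_cells <- names(which(plot1 == 3))

# TRUE when both selections contain exactly the same cells.
same_cells <- setequal(zero_sum_cells, cluster3_cells)

# Cells picked by one approach but not the other (empty if they agree).
only_in_sum <- setdiff(zero_sum_cells, cluster3_cells)
```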
I'm closing this thread, thanks for pointing me to the issue.