ay-lab/dcHiC

error in hclust (--pcatype select)

Nico-FR opened this issue · 4 comments

Hello,

The first step, Rscript dchicf.r --file input_files.txt --pcatype cis, works very well for compartment calling.

But I get an error at the second step, Rscript dchicf.r --file input_files.txt --pcatype select:

Error in hclust(as.dist(round(1 - cor(pc.mat), 4))) : 
  NA/NaN/Inf in foreign function call (arg 10)
Calls: pcselect -> pcselectioncore -> hclust
Execution halted

It seems to work for the first two samples, since stdout shows:

Running  intra   1  in  poll_0197  sample
Running  intra   1  in  poll_3654  sample

Here is my input file:

Bovin-0197.ARS-UCD1.2.mapq_10.50000.txt   Bovin-0197.ARS-UCD1.2.mapq_10.50000.bed   poll_0197       poll
Bovin-3654.ARS-UCD1.2.mapq_10.50000.txt   Bovin-3654.ARS-UCD1.2.mapq_10.50000.bed   poll_3654       poll
Bovin-669.ARS-UCD1.2.mapq_10.50000.txt    Bovin-669.ARS-UCD1.2.mapq_10.50000.bed    unp_669 unp
Bovin-977.ARS-UCD1.2.mapq_10.50000.txt    Bovin-977.ARS-UCD1.2.mapq_10.50000.bed    unp_977 unp

Any idea?

So, that part of the code tries to select the best PC out of all those calculated. My guess is that if one of the chromosomes has a very low correlation with either the transcription start sites or GC content, it may give this error. I would first suggest checking for anomalies in the unp PC values (for example, by plotting them with a custom R script). Sometimes, for unconventional genomes, you may need to hand-pick PCs for some of the chromosomes. Let me know how it goes; happy to help you out.
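One quick way to screen PC values for anomalies is to look for non-finite entries and constant columns, since either makes a correlation undefined (in R, cor() returns NA for a zero-variance column, and hclust rejects NA distances). The sketch below is in Python rather than R, the function name find_bad_columns is hypothetical, and the values are made up for illustration; none of it is part of dcHiC.

```python
import math

def find_bad_columns(pc_columns):
    """Return {column name: reason} for columns that make a correlation undefined."""
    problems = {}
    for name, values in pc_columns.items():
        if not values:
            problems[name] = "empty"
        elif any(not math.isfinite(v) for v in values):
            problems[name] = "non-finite values"
        elif max(values) == min(values):
            problems[name] = "zero variance"
    return problems

# Made-up PC values for illustration only (not taken from the issue).
example = {
    "sample_a": [0.8, -0.3, 0.5, -0.9],
    "sample_b": [float("nan"), 0.2, -0.4, 0.1],
    "sample_c": [0.0, 0.0, 0.0, 0.0],
}
report = find_bad_columns(example)
print(report)
```

An empty report would suggest the PC values themselves are fine and that the cause lies elsewhere, for instance in the reference annotation used for PC selection.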

OK, I checked the PCs, and PC1 is always the right one, except for chromosome X. I tried without X, but the same error occurs.
The quality is the same for poll or unp samples, so I do not understand why it did not work for unp.

Example of PC1 for an unp sample
image

Example of PC2 for an unp sample
image

I also have expression data for those individuals, so I can easily select the PCs and orient them.

I will try the next step with Rscript utility/reselectpc.r --reselect ref, but I do not understand how it works. How can I use my own oriented bedGraph with PC1 values?

OK, my bad! The chromosomes in my matrices are named "1", but they are named "chr1" in the UCSC goldenPath files...
The stdout put us on the wrong track.
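A naming mismatch like this can be caught before running the PC selection step by comparing the chromosome names in the matrix bins with those in the annotation. Here is a minimal Python sketch, assuming the names have already been read into two lists; the function names and example values are hypothetical and not part of dcHiC.

```python
def strip_chr(name):
    """Drop a leading 'chr' so that 'chr1' and '1' compare equal."""
    return name[3:] if name.startswith("chr") else name

def naming_mismatch(matrix_chroms, annotation_chroms):
    """Return (exact matches, names that match only after removing 'chr')."""
    matrix_set, annot_set = set(matrix_chroms), set(annotation_chroms)
    exact = matrix_set & annot_set
    annot_by_stripped = {strip_chr(c): c for c in annot_set}
    prefix_only = {
        c: annot_by_stripped[strip_chr(c)]
        for c in matrix_set - exact
        if strip_chr(c) in annot_by_stripped
    }
    return exact, prefix_only

# Illustrative names only: bins named "1", annotation named "chr1".
matrix_chroms = ["1", "2", "X"]
annotation_chroms = ["chr1", "chr2", "chrX"]
exact, prefix_only = naming_mismatch(matrix_chroms, annotation_chroms)
print(sorted(prefix_only.items()))
```

If exact is empty while prefix_only is not, the two inputs use different naming conventions and one of them needs to be renamed before the run.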

Added a note in the documentation to look out for this! Thank you for raising the issue.