No final.hic file generated
ithacajing opened this issue · 5 comments
Hi,
Here is my command; it took almost 28 days to finish.
run-asm-pipeline.sh -m diploid -i 5000 --splitter-coarse-resolution 100000 --splitter-fine-resolution 1000 Perrie_HiC.asm.hic.p_ctg.fa merged_nodups.txt
Getting from *.rawchrom.fasta to FINAL.fasta took 26 days (super SLOW). Most of that time was spent generating the alignments.txt file.
The rawchrom.fasta file is 2.1G. I am wondering whether that is too big for the pipeline to handle.
I got *.final.asm, final.cprops, *final.assembly, *FINAL.fasta and *FINAL.assembly; however, no final.hic file was generated.
I improved the pipeline performance by doing the following:
@Debian:~/tools/improved-3d-dna$ diff run-asm-pipeline.sh ../3d-dna/run-asm-pipeline.sh
194,195c194
< #default_merger_lastz_options="--gfextend\ --gapped\ --chain=200,200"
< default_merger_lastz_options="--gfextend\ --gapped\ --chain=200,200\ --allocate:traceback=1.99G"
---
> default_merger_lastz_options="--gfextend\ --gapped\ --chain=200,200"
831,834c830
< # improve the performance
< # wrapped.fasta is much faster than orig_fasta
< awk -f ${pipeline}/utils/wrap-fasta-sequence.awk ${orig_fasta} > ${genomeid}.wrapped.fasta
< awk -f ${pipeline}/edit/edit-fasta-according-to-new-cprops.awk ${genomeid}.rawchrom.cprops ${genomeid}.wrapped.fasta > ${genomeid}.rawchrom.fasta
---
> awk -f ${pipeline}/edit/edit-fasta-according-to-new-cprops.awk ${genomeid}.rawchrom.cprops ${orig_fasta} > ${genomeid}.rawchrom.fasta
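For readers unfamiliar with the wrapping step in the diff above, here is a minimal sketch of what re-wrapping a FASTA file can look like. It is not the pipeline's own wrap-fasta-sequence.awk, whose exact behaviour is not shown in this thread; the function name `wrap_fasta` and the default width of 80 are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical helper (NOT part of 3d-dna): re-wrap every FASTA sequence to a
# fixed line width so that no single line holds a whole contig.
wrap_fasta() {
    # $1 = input FASTA, $2 = line width (assumed default: 80)
    awk -v width="${2:-80}" '
        function flush_seq() {
            # Print the buffered sequence in chunks of at most `width` bases.
            for (i = 1; i <= length(seq); i += width)
                print substr(seq, i, width)
            seq = ""
        }
        /^>/ { flush_seq(); print; next }   # header: emit pending sequence, then the header
        { seq = seq $0 }                    # sequence line: append to the buffer
        END { flush_seq() }                 # emit the last record
    ' "$1"
}
```

Usage would be along the lines of `wrap_fasta Perrie_HiC.asm.hic.p_ctg.fa 80 > wrapped.fasta`. Note that this sketch buffers each sequence in memory before printing it, which may matter for very long contigs.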
Any suggestions as to why the final.hic file is missing and why the alignment step was SO slow? How can we speed it up?
Thanks,
jing
Hi Olga,
Thank you so much for your explanation and advice.
Best,
Jing
Hi,
I ran it in haploid mode for 8 rounds of correction and didn't get the final.hic file either. Interestingly, I had soft-linked the merged_nodups.txt file from the Juicer output folder, and 3d-dna somehow modified it into an empty file. Do you know what could be the reason?
best,
Cui
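One defensive way to guard against the empty-file problem described above is to check the input before a long run and work on a private copy instead of a symlink. The sketch below is only an illustration: `stage_input` is a hypothetical helper, not part of 3d-dna or Juicer, and it does not explain why the file was emptied in the first place.

```shell
#!/bin/sh
# Hypothetical guard (NOT part of 3d-dna or Juicer): refuse to start when the
# read list is missing or empty, and stage a regular-file copy of it.
stage_input() {
    # $1 = path to merged_nodups.txt (may be a symlink), $2 = working directory
    src=$1
    workdir=$2
    if [ ! -s "$src" ]; then
        echo "error: $src is missing or empty" >&2
        return 1
    fi
    mkdir -p "$workdir" || return 1
    # cp -L follows the symlink, so the staged copy is a regular file and the
    # original in the Juicer output folder is never opened for writing.
    cp -L "$src" "$workdir/merged_nodups.txt" || return 1
    echo "$workdir/merged_nodups.txt"
}
```

The copy costs disk space, which can be significant for a large merged_nodups.txt; making the original read-only before the run would be a cheaper alternative under the same assumption.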
Thanks, Olga. I somehow managed to finish the pipeline in another run, and the result looks quite satisfying!
best,
Cui