vpc-ccg/haslr

Why are all backbone and asm files empty?

xiekunwhy opened this issue · 6 comments

Hi,

I used ~30X ONT data and ~300X Illumina data to assemble a highly heterozygous genome. All backbone and asm files are empty. Why?
The last several lines of the 3sr_k49_a3.log file are:
traversal : contig
nb_contigs : 563614877
nb_small_contigs_discarded : 45039966
nt_assembled : 38008131973
max_length : 22004
graph simpification stats
tips removed : 367186126 + 17766381 + 1374406 + 191579
bulges removed : 2859263 + 63029 + 4006
EC removed
assembly traversal stats
time : 194384.097
assembly : 27581.812
graph construction : 166802.285

Best,
Kun

image

Hi @xiekunwhy

This is strange. My guess is that the short reads were not properly assembled by Minia in the first place.
Can you please also share the size of files in the parent directory of the one you have shared?
Also, what is the expected size of the genome?

Hi,
The expected genome size is ~3.2G according to FCM results, but it is a highly heterozygous genome (only a heterozygous peak in the k-mer curve plot).

The following are the file sizes in the parent directory. Any suggestions?

image
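
A rough sanity check on the numbers quoted above might help when judging whether the short-read assembly looks plausible. The sketch below is not part of HASLR or Minia; the function names are made up for illustration, and it assumes the log tail uses simple `key : integer` lines like the ones pasted earlier in this thread. It compares `nt_assembled` with the expected genome size (~3.2G) and computes the mean contig length.

```python
import re

def parse_minia_stats(log_text):
    """Collect 'key : integer' lines from a Minia-style log tail into a dict.

    Lines with non-integer values (e.g. timings) or multi-word keys are skipped.
    """
    stats = {}
    for line in log_text.splitlines():
        match = re.match(r"^\s*(\w+)\s*:\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats

# Log tail copied from the first post in this thread.
LOG_TAIL = """\
traversal : contig
nb_contigs : 563614877
nb_small_contigs_discarded : 45039966
nt_assembled : 38008131973
max_length : 22004
time : 194384.097
"""

EXPECTED_GENOME_SIZE = 3.2e9  # ~3.2G, from the FCM estimate mentioned above

stats = parse_minia_stats(LOG_TAIL)
ratio = stats["nt_assembled"] / EXPECTED_GENOME_SIZE
mean_len = stats["nt_assembled"] / stats["nb_contigs"]

print(f"assembled / expected genome size: {ratio:.1f}x")
print(f"mean contig length: {mean_len:.1f} bp")
```

With these values the assembled total comes out well above the expected genome size and the mean contig length is only slightly above k=49, which would be consistent with a very fragmented short-read assembly. That is only a hint, not a diagnosis.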

Hi folks, any progress on this issue? I'm facing the same problem: the quick-start example works well, but with real data (Illumina + ONT) both the conda and source-code versions produce an empty final assembly. No errors were raised during compilation.

checking /software/conda-modules/5.3.1/envs/haslr/bin/haslr_assemble: ok
checking /software/conda-modules/5.3.1/envs/haslr/bin/minia_nooverlap: ok
checking /software/conda-modules/5.3.1/envs/haslr/bin/fastutils: ok
checking /software/conda-modules/5.3.1/envs/haslr/bin/minia: ok
checking /software/conda-modules/5.3.1/envs/haslr/bin/minimap2: ok
number of threads: 8
output directory: /scratch/vorel/job_138582.elixir-pbs.elixir-czech.cz/ver_1_out
subsampling 25x long reads to /scratch/vorel/job_138582.elixir-pbs.elixir-czech.cz/ver_1_out/lr25x.fasta... done
assembling short reads using Minia... done
removing overlaps in short read assembly... done
removing short sequences in short read assembly... done
aligning long reads to short read assembly using minimap2... done
assembling long reads using HASLR... done

haslr.py -t 8 -o ver_1_out -g 50m -l run1_2_merge_trim.fastq -x nanopore --aln-sim 0.80 -s Ktang_Tang_1_trim_paired_dedup_cor.fastq Ktang_Tang_2_trim_paired_dedup_cor.fastq Ktang_Tang_1_trim_unpaired_dedup_cor.fastq Ktang_Tang_2_trim_unpaired_dedup_cor.fastq
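
Since the log reports subsampling long reads to 25x of the genome size given with `-g`, it may be worth checking how much long-read coverage the input actually provides. The sketch below is a standalone helper, not HASLR code. It assumes a plain (uncompressed) FASTQ with exactly four lines per record, and it assumes the `k`/`m`/`g` suffixes mean 10^3/10^6/10^9; the function names are hypothetical.

```python
SUFFIXES = {"k": 10**3, "m": 10**6, "g": 10**9}  # assumed meaning of the suffixes

def parse_genome_size(text):
    """Turn a size string such as '50m' into a number of bases."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(round(float(text[:-1]) * SUFFIXES[text[-1]]))
    return int(text)

def fastq_total_bases(path):
    """Sum sequence lengths, assuming strict 4-line FASTQ records."""
    total = 0
    with open(path) as handle:
        for index, line in enumerate(handle):
            if index % 4 == 1:  # second line of each record is the sequence
                total += len(line.strip())
    return total

def long_read_coverage(path, genome_size):
    """Estimated coverage = total long-read bases / genome size."""
    return fastq_total_bases(path) / parse_genome_size(genome_size)
```

For the command above this would be called as `long_read_coverage("run1_2_merge_trim.fastq", "50m")`. A value far below 25 would mean there are fewer long reads than the subsampling step expects.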

I have the same problem when trying to run a similar pipeline. Does anyone have a solution?

same here

same here, the final assembly fasta does not even exist


-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 asm.final.ann
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.01.init.gfa
-rw-r--r-- 1 hoelzerm domänen-benutzer   42 Jun 10 09:32 backbone.01.init.stat
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.02.weakEdge.gfa
-rw-r--r-- 1 hoelzerm domänen-benutzer   42 Jun 10 09:32 backbone.02.weakEdge.stat
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.03.tip.gfa
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.03.tip.log
-rw-r--r-- 1 hoelzerm domänen-benutzer   42 Jun 10 09:32 backbone.03.tip.stat
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.04.simplebubble.gfa
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.04.simplebubble.log
-rw-r--r-- 1 hoelzerm domänen-benutzer   42 Jun 10 09:32 backbone.04.simplebubble.stat
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.05.superbubble.gfa
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.05.superbubble.log
-rw-r--r-- 1 hoelzerm domänen-benutzer   42 Jun 10 09:32 backbone.05.superbubble.stat
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.06.smallbubble.log
-rw-r--r-- 1 hoelzerm domänen-benutzer   42 Jun 10 09:32 backbone.06.smallbubble.stat
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 backbone.branching.log
-rw-r--r-- 1 hoelzerm domänen-benutzer    4 Jun 10 09:32 compact_uniq.txt
-rw-r--r-- 1 hoelzerm domänen-benutzer 7.3M Jun 10 09:32 index.contig
-rw-r--r-- 1 hoelzerm domänen-benutzer  41K Jun 10 09:32 index.longread
-rw-r--r-- 1 hoelzerm domänen-benutzer    0 Jun 10 09:32 log_asmfinal.txt

but it seems the short-read assembly worked

-rw-r--r-- 1 hoelzerm domänen-benutzer  44M Jun 10 09:32 sr_k49_a3.unitigs.fa

I expect ~40Mb
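
For anyone comparing runs, a small script can list which output files came out empty instead of reading the `ls` output by eye. This is only a sketch under the assumption that the files of interest match `backbone.*` and `asm.*`, as in the listing above; the function name and default patterns are made up for illustration.

```python
from pathlib import Path

def empty_outputs(directory, patterns=("backbone.*", "asm.*")):
    """Return names of zero-byte files in `directory` matching the patterns."""
    directory = Path(directory)
    empty = []
    for pattern in patterns:
        for path in sorted(directory.glob(pattern)):
            if path.is_file() and path.stat().st_size == 0:
                empty.append(path.name)
    return empty
```

Calling `empty_outputs(".")` from inside the directory shown above should report the zero-byte `.gfa`, `.log`, and `asm.final.ann` files while skipping the 42-byte `.stat` files.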