yechengxi/DBG2OLC

truncated backbone_raw.fasta

matt-shenton opened this issue · 2 comments

Hi there,

thanks a lot for this pipeline which looks really cool.

I'm having an issue where some Backbones listed in the DBG2OLC_Consensus_info.txt output are not appearing in the backbone_raw.fasta file.

(Around 1500 Backbones in the consensus info, 528 in the backbone_raw.fasta)

In the output I'm getting some messages like:

Loading contigs.
180001914 k-mers in round 1.
168517309 k-mers in round 2.
...skipping...
4715199 alignments calculated.
165 secs.
Loading non-contained sequences.
22107 loaded.
error: complement_strR
error: complement_strS
error: complement_strM
error: complement_strR
error: complement_strS
error: complement_strY
error: complement_strY
error: complement_strY
error: complement_strY
error: complement_strR
frag sum: 327727465
offset sum: 158812461
Extension warning.
Extension warning.
Extension warning.

Can you point me to the meaning of these error: complement statements?

best wishes

Matt

It looks like your data is not raw sequencing data. Only A, T, C, and G are allowed, but your data contains other characters such as R, S, M, and Y.
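
One way to confirm this on the input files is to count every character in the sequence lines that is not A, C, G, or T. The following is a minimal Python sketch, not part of DBG2OLC; the function name `count_non_acgt` and the two example reads are made up for illustration, and it assumes plain FASTA input (header lines starting with `>`).

```python
from collections import Counter
import io


def count_non_acgt(fasta_handle):
    """Count characters other than A, C, G, T in the sequence lines of a FASTA file."""
    counts = Counter()
    for line in fasta_handle:
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        for ch in line.upper():
            if ch not in "ACGT":
                counts[ch] += 1
    return counts


# Illustrative input only; replace with open("your_reads.fasta") for a real file.
example = io.StringIO(">read1\nACGTRACG\n>read2\nSSMYACGT\n")
for char, n in sorted(count_non_acgt(example).items()):
    print(char, n)
```

An empty result means the file contains only A, C, G, and T. Any other characters reported here would be expected to trigger the `error: complement_str` messages shown in the log above.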

Thanks a lot for your answer, sorry for taking your time.

I used pbsim to simulate data, but the reference genomes contained IUPAC ambiguity codes.

Thanks a lot for your help

Matt
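
For anyone simulating reads from a reference that contains IUPAC ambiguity codes, one possible workaround is to resolve those codes before running the simulator. The sketch below is an assumption, not something provided by DBG2OLC or pbsim: it replaces each ambiguity code with a randomly chosen compatible base, and the helper names `resolve_ambiguity` and `clean_fasta` are hypothetical. Note that long runs of N would become random sequence under this approach, which may or may not be acceptable for a given simulation.

```python
import io
import random

# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def resolve_ambiguity(seq, rng):
    """Uppercase seq and replace each ambiguity code with one compatible base."""
    return "".join(rng.choice(IUPAC[b]) if b in IUPAC else b for b in seq.upper())


def clean_fasta(src, dst, seed=0):
    """Copy FASTA from src to dst, resolving ambiguity codes in sequence lines."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    for line in src:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            dst.write(line + "\n")  # header lines are copied unchanged
        else:
            dst.write(resolve_ambiguity(line, rng) + "\n")


# Illustrative input only; use real file handles for an actual reference.
src = io.StringIO(">chr1\nACGTRYSM\n")
dst = io.StringIO()
clean_fasta(src, dst)
print(dst.getvalue(), end="")
```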