ekg/seqwish

terminate called after throwing an instance of 'std::out_of_range'

Closed issue · 1 comment

Hi,

I have a GFA1 file produced by cuttlefish (https://github.com/COMBINE-lab/cuttlefish) and I want to "bluntify" it. As suggested at https://github.com/ekg/gimbricate, I ran:

gimbricate -g 10_sars-cov-2.gfa -n -f tmp.fa -p tmp.paf > tmp.gfa
seqwish -s tmp.fa -p tmp.paf -g tmp.seqwish.gfa -P

However, this gives me an error:

[seqwish::seqidx] 0.001 indexing sequences
[seqwish::seqidx] 0.033 index built
[seqwish::alignments] 0.033 processing alignments
terminate called after throwing an instance of 'std::out_of_range'
  what():  stol
Aborted
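
For readers hitting the same abort: the `what():  stol` line indicates the exception came from C++ `std::stol`, which throws `std::out_of_range` when the parsed number does not fit in a `long`. The log does not show which input field was being parsed. The sketch below mimics that range check in Python; it assumes a 64-bit Linux build where `long` is a signed 64-bit integer, and `fits_in_long` is a hypothetical helper name, not part of seqwish.

```python
# Assumption: LP64 platform (64-bit Linux), where C++ `long` is signed 64-bit.
LONG_MIN = -(1 << 63)
LONG_MAX = (1 << 63) - 1


def fits_in_long(text):
    """Return True if the decimal string would fit in a signed 64-bit long.

    std::stol raises std::out_of_range for values outside this range.
    """
    return LONG_MIN <= int(text) <= LONG_MAX
```

Under that assumption, an ordinary coordinate such as `542` passes, while a value such as `18446744073709551615` (2^64 - 1) does not, so any numeric field holding a wrapped unsigned value could trigger this error.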

I am not sure whether gimbricate or seqwish causes this problem, but I did notice that gimbricate's GFA output drops "0M" CIGAR strings, while a non-empty CIGAR string such as "24M" is rewritten as "24=".
Also, the PAF file contains extremely large (and impossible) numbers on the lines that correspond to "0M" overlaps in the original GFA, for example:

15867	542	518	542	-	4392	49	25	49	24	0	100	cg:Z:24=
25630	49	25	49	+	15867	542	0	24	24	0	100	cg:Z:24=
15867	542	518	542	-	4392	49	25	49	24	0	100	cg:Z:24=
25630	49	25	49	+	15867	542	0	24	24	0	100	cg:Z:24=
15867	542	518	542	-	4392	49	25	49	24	0	100	cg:Z:24=
25630	49	25	49	+	15867	542	0	24	24	0	100	cg:Z:24=
15867	542	18446744072793273373	542	+	17434	309	18446744073709551615	18446744073709518850	0	0	100	cg:Z:
6867	224	18446744072793273055	224	+	15867	542	18446744073709551615	18446744073709518850	0	0	100	cg:Z:
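
A quick way to locate such records before running seqwish is to check each PAF line's intervals against the sequence lengths. The following is a minimal sketch, assuming the standard tab-separated PAF column order (query length/start/end in columns 2-4, target length/start/end in columns 7-9). The function names are hypothetical and not part of gimbricate or seqwish.

```python
MASK64 = 1 << 64


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - MASK64 if value >= (1 << 63) else value


def check_paf_line(line):
    """Return a list of problems found in one PAF record (empty if none)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        return [f"expected at least 12 columns, got {len(fields)}"]
    problems = []
    # 0-based offsets: fields[1:4] describe the query, fields[6:9] the target.
    for label, offset in (("query", 1), ("target", 6)):
        length, start, end = (int(x) for x in fields[offset:offset + 3])
        # A valid interval satisfies 0 <= start <= end <= length.
        if not 0 <= start <= end <= length:
            problems.append(
                f"{label} interval [{start}, {end}) outside 0..{length} "
                f"(as signed 64-bit: start={as_signed64(start)}, "
                f"end={as_signed64(end)})"
            )
    return problems


def find_bad_paf_rows(lines):
    """Yield (line_number, problems) for every record that fails the check."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        problems = check_paf_line(line)
        if problems:
            yield number, problems
```

Running `find_bad_paf_rows(open("tmp.paf"))` on the file above would flag the last two records shown. Read as signed 64-bit integers, `18446744073709551615` is -1 and `18446744073709518850` is -32766, which looks like an unsigned subtraction that wrapped around for the zero-length overlaps, though that is only a guess from the values.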

I am using seqwish version 0.7.1 (installed with conda); for gimbricate, I cloned the current git repository.

Hi @dirkjanvw, seqwish version 0.7.2 is available on bioconda and should avoid your issue.