lh3/gfatools

Cigar strings altered after extracting subgraph

rlorigro opened this issue · 0 comments

Before using the command:

gfatools view -l 303726,303738 -r 20 Assembly-BothStrands.gfa > Assembly-BothStrands_303726-303738_r20.gfa

my GFA has the following elements:

S       5924062 GTTTCAAAAAAAAAAAAAGAGCATGGCTCTGG        RC:i:400
L       5209751 +       5924062 +       18M1D6M
L       5420387 +       5924062 +       18M1D6M

After running the command, they look like this:

S	5924062	GTTTCAAAAAAAAAAAAAGAGCATGGCTCTGG	LN:i:32	RC:i:400
L	5209751	+	5924062	+	24M	L1:i:33	L2:i:8
L	5420387	+	5924062	+	24M	L1:i:33	L2:i:8
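To find every link affected, not only the two shown here, the L lines of the two files can be compared programmatically. The following is a minimal sketch, not part of gfatools: the helper names `parse_links` and `changed_overlaps` are hypothetical, and it assumes fields contain no embedded spaces, so that splitting on any whitespace is safe.

```python
def parse_links(gfa_text):
    """Map (from, from_orient, to, to_orient) -> overlap CIGAR for each L line."""
    links = {}
    for line in gfa_text.splitlines():
        # Assumption: no field contains spaces, so whitespace splitting
        # handles both tab- and space-separated pastes.
        fields = line.split()
        if len(fields) >= 6 and fields[0] == "L":
            links[tuple(fields[1:5])] = fields[5]
    return links


def changed_overlaps(before_text, after_text):
    """Return links present in both inputs whose overlap CIGAR differs."""
    before = parse_links(before_text)
    after = parse_links(after_text)
    return {
        key: (before[key], cigar)
        for key, cigar in after.items()
        if key in before and before[key] != cigar
    }


# Records copied from this issue.
before = """\
S 5924062 GTTTCAAAAAAAAAAAAAGAGCATGGCTCTGG RC:i:400
L 5209751 + 5924062 + 18M1D6M
L 5420387 + 5924062 + 18M1D6M
"""
after = """\
S 5924062 GTTTCAAAAAAAAAAAAAGAGCATGGCTCTGG LN:i:32 RC:i:400
L 5209751 + 5924062 + 24M L1:i:33 L2:i:8
L 5420387 + 5924062 + 24M L1:i:33 L2:i:8
"""

for key, (old, new) in changed_overlaps(before, after).items():
    print(" ".join(key), old, "->", new)
```

For real files, the two strings would be read from the input GFA and the extracted subgraph instead of being pasted inline.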

These overlap CIGARs are incorrect. The original 18M1D6M has been replaced with 24M, which drops the 1 bp deletion:

Cigar: 24M

5924062    1                                   GTTTCAAAAAAAAAAAAAGAGCATGGCTCTGG
                                               |||||||||||||||||||||||||
5420387    34 ACTGCATTCCAGCCTGGGTGATAGAGTGAGGCTGTTTCAAAAAAAAAAAAAAGAGCAT

Compare this to a local alignment computed with Smith-Waterman, which contains a 1 bp indel:

Cigar:  5M1I19M

5924062    1                                   GTTTC-AAAAAAAAAAAAAGAGCATGGCTCTGG
                                               ||||| |||||||||||||||||||
5420387    34 ACTGCATTCCAGCCTGGGTGATAGAGTGAGGCTGTTTCAAAAAAAAAAAAAAGAGCAT
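The difference between the three CIGARs can also be seen from the number of bases each one spans. The sketch below is illustrative only and assumes standard SAM CIGAR semantics (M, = and X consume both sequences, I and S consume only the query, D and N consume only the reference). The function name `cigar_spans` is hypothetical, and which GFA segment plays the role of query or reference is deliberately not asserted here.

```python
import re

# One CIGAR operation: a run length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_spans(cigar):
    """Return (query_len, ref_len) consumed by a CIGAR string."""
    query_len = 0
    ref_len = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "M=X":      # consumes both sequences
            query_len += n
            ref_len += n
        elif op in "IS":     # consumes the query only
            query_len += n
        elif op in "DN":     # consumes the reference only
            ref_len += n
        # H and P consume neither sequence
    return query_len, ref_len


for cigar in ("18M1D6M", "24M", "5M1I19M"):
    print(cigar, cigar_spans(cigar))
# 18M1D6M (24, 25)
# 24M (24, 24)
# 5M1I19M (25, 24)
```

Both the original CIGAR and the Smith-Waterman CIGAR span 24 bases on one sequence and 25 on the other, whereas 24M spans 24 bases on both, so the single-base length difference is lost in the extracted subgraph.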