ggonnella/gfapy

gfa-convert: custom sl:i tag ought to be LN:i

sjackman opened this issue · 5 comments

I'm guessing here that sl:i is the sequence length. The standard name for this tag is LN:i.
See https://github.com/GFA-spec/GFA-spec/blob/master/GFA1.md#optional-fields-2
Observed output

S	A	AAAAAAACGT	sl:i:10

Expected output

S	A	AAAAAAACGT	LN:i:10

It was LN, and I changed it to sl because, according to the GFA2 specification, the content of the slen field does not have to be the actual length of the segment sequence. Maybe, if the sequence is available, I can change it back to LN when slen agrees with the sequence length.

I never did like that bit of the GFA 2 spec. It's peculiar. I would go with LN:i all the same. If the sequence is not present, then I would use LN:i. If the sequence is present, then I wouldn't include any LN:i tag at all.

I do not like it either; I never understood the reason for it.

> If the sequence is present, then I wouldn't include any LN:i tag at all.

On second thought, do you think it's helpful to always include the LN:i tag in GFA 1 output?

Yes, I think that if the information is there, one can output it.
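For illustration, here is a minimal sketch of the rule discussed above. It is not gfapy's actual implementation; the function name `gfa2_segment_to_gfa1` is hypothetical, and falling back to the custom sl:i tag when slen disagrees with the sequence length is an assumption based on the comments in this thread. It assumes a tab-separated GFA2 segment line of the form `S <sid> <slen> <sequence> [tags...]`.

```python
def gfa2_segment_to_gfa1(line: str) -> str:
    """Convert one GFA2 segment line to a GFA1 segment line (sketch only)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4 or fields[0] != "S":
        raise ValueError("expected a GFA2 segment line: S <sid> <slen> <sequence> [tags]")

    _, sid, slen_text, sequence, *tags = fields
    slen = int(slen_text)

    # Use the standard LN:i tag when the sequence is absent ("*" placeholder)
    # or when slen agrees with the actual sequence length.
    if sequence == "*" or len(sequence) == slen:
        length_tag = f"LN:i:{slen}"
    else:
        # Assumption: keep the custom sl:i tag when slen and the sequence
        # length disagree, so that no information from the GFA2 line is lost.
        length_tag = f"sl:i:{slen}"

    return "\t".join(["S", sid, sequence, *tags, length_tag])


if __name__ == "__main__":
    print(gfa2_segment_to_gfa1("S\tA\t10\tAAAAAAACGT"))
```

With this rule, the segment from the report would come out with LN:i:10, and a segment with a `*` sequence would still carry its length as LN:i.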