Incorrect GFF location with sequence that is 100% CRISPR
fangly opened this issue · 2 comments
Miriam Shiffman reported a problem when running Prokka. Using Minced 0.1.5, I traced it back to a contig that is 100% covered by CRISPRs. The command was:
minced -gff troublesomeContig.fa results.gff
troublesomeContig.fa is:
>707_L1_merged_contig_534811
GTCGCCCCTCACGCAGGGGCGTGAGTTGAAATGGTTCCTTAGCCATCACGCACCCACCTC
CGCAACATGTCGCCCCTCACGCAGGGGCGTGGGTTGAAATTTAACTTGCGTTTCCAGCAT
CACCGGTTTCTGCGCGTCGCCCCTCACGCAGGGGCGTGAGTTGAAATGGCCTGCGGGGAG
GTGATGCCGCATGATCGTAAGCAGTCGCCCCTCACGCAGGGGCGTGAGTTGAAATTGCTC
GCGAACATGCGCCGCCTGTAAATACTCCCGGTCGCCACTCACGCAGGGGCGTGAGTTGAA
AT
results.gff is:
##gff-version 3
707_L1_merged_contig_534811 minced:0.1.5 CRISPR 1 303 5 . . ID=CRISPR1
The contig is 302 bp long, but the reported end of the CRISPR region is 303, i.e. one base past the end of the contig. This is what causes trouble in Prokka. I cannot dig into the code to investigate further at the moment; it will have to wait until next week (or until Connor fixes this :)
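For anyone who wants to check their own output for the same symptom, one option is to compare each GFF end coordinate with the length of the matching FASTA record. Below is a minimal Python sketch, not anything taken from Minced or Prokka. The function names read_fasta_lengths and out_of_range_features are hypothetical, and it assumes a plain FASTA file and GFF lines whose first five columns contain no embedded spaces.

```python
def read_fasta_lengths(lines):
    """Return {sequence_id: length} from an iterable of FASTA lines."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The sequence ID is the first word after '>'.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def out_of_range_features(gff_lines, lengths):
    """Yield (seqid, start, end, length) for features outside 1..length."""
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        # Assumption: whitespace splitting is safe for the first five columns.
        fields = line.split()
        seqid, start, end = fields[0], int(fields[3]), int(fields[4])
        length = lengths.get(seqid)
        # GFF coordinates are 1-based and inclusive.
        if length is not None and (start < 1 or end > length):
            yield seqid, start, end, length


if __name__ == "__main__":
    # File names taken from the command in this report.
    with open("troublesomeContig.fa") as fasta, open("results.gff") as gff:
        contig_lengths = read_fasta_lengths(fasta)
        for seqid, start, end, length in out_of_range_features(gff, contig_lengths):
            print(f"{seqid}: feature {start}-{end} exceeds contig length {length}")
```

With the files above, this should flag the CRISPR1 feature, since its end (303) is larger than the contig length (302).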
Florent
Fixed in version 0.1.6
This resolves my bug too:
tseemann/prokka#21
I've updated my version and requirements:
tseemann/prokka@e4dfd70