ctSkennerton/minced

Incorrect GFF location with sequence that is 100% CRISPR

fangly opened this issue · 2 comments

Miriam Shiffman has reported a problem when running Prokka. With Minced 0.1.5, I traced the problem back to a contig that is 100% covered by CRISPRs.

minced -gff troublesomeContig.fa results.gff

troublesomeContig.fa is:

>707_L1_merged_contig_534811
GTCGCCCCTCACGCAGGGGCGTGAGTTGAAATGGTTCCTTAGCCATCACGCACCCACCTC
CGCAACATGTCGCCCCTCACGCAGGGGCGTGGGTTGAAATTTAACTTGCGTTTCCAGCAT
CACCGGTTTCTGCGCGTCGCCCCTCACGCAGGGGCGTGAGTTGAAATGGCCTGCGGGGAG
GTGATGCCGCATGATCGTAAGCAGTCGCCCCTCACGCAGGGGCGTGAGTTGAAATTGCTC
GCGAACATGCGCCGCCTGTAAATACTCCCGGTCGCCACTCACGCAGGGGCGTGAGTTGAA
AT

results.gff is:

##gff-version 3

707_L1_merged_contig_534811 minced:0.1.5 CRISPR 1 303 5 . . ID=CRISPR1

The contig is 302 bp long, but the reported end of the CRISPR region is 303 (i.e. one base beyond the end of the contig). This is what causes trouble in Prokka. I cannot investigate further or dig into the code at the moment. It will have to wait until next week (or until Connor fixes this :)
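
For anyone who wants to check their own output for the same symptom, here is a minimal Python sketch that compares each GFF feature's start and end against the length of its contig. It is not part of minced or Prokka; the function names `fasta_lengths` and `out_of_bounds_features` are made up for illustration, and it assumes a plain FASTA file and a tab-separated GFF3 file with 1-based inclusive coordinates.

```python
def fasta_lengths(handle):
    """Return {sequence_id: length} for a FASTA stream (hypothetical helper)."""
    lengths = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The sequence ID is the first whitespace-delimited token of the header.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def out_of_bounds_features(gff_handle, lengths):
    """Yield (seqid, start, end, contig_length) for features outside 1..length."""
    for line in gff_handle:
        line = line.rstrip("\n")
        # Skip blank lines and directives/comments such as "##gff-version 3".
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")  # assumes tab-separated GFF3 columns
        if len(cols) < 5:
            continue
        seqid, start, end = cols[0], int(cols[3]), int(cols[4])
        length = lengths.get(seqid)
        # Features on contigs missing from the FASTA are silently ignored here.
        if length is not None and (start < 1 or end > length):
            yield seqid, start, end, length
```

Called with open handles for troublesomeContig.fa and results.gff, this should report the single CRISPR feature whose end (303) exceeds the contig length (302).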

Florent

Fixed in version 0.1.6

This resolves my bug too:
tseemann/prokka#21
I've updated my version and requirements:
tseemann/prokka@e4dfd70