marbl/harvest

ignoring sequences ending with the same numbers as the reference genome


I noticed this odd little bug in parsnp v1.2. When running parsnp with the -c -d options and a reference whose file name ends in digits, genomes whose file names match the tail end of that reference name get excluded.

For example:
when using the reference "H4476.fasta", the genomes 6.fasta, 76.fasta and 476.fasta get silently ignored (they are not listed in the ini files). When I rename these three genomes to bla6bla.fasta, bla76bla.fasta and bla476bla.fasta, they do get included. I'm assuming this is some sort of bug in the code that excludes the reference sequence from being selected as a query genome.
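
The thread does not show the parsnp code itself, so the following is only a minimal sketch of the kind of comparison that would produce this symptom. Both function names are hypothetical and are not taken from parsnp. The first filter drops a candidate whenever the reference name ends with the candidate name; the second drops it only on an exact match.

```python
def buggy_select_queries(reference, candidates):
    """Hypothetical filter: skips any candidate whose name is a suffix of the reference name."""
    # "H4476.fasta".endswith("76.fasta") is True, so 76.fasta is dropped along with the reference.
    return [name for name in candidates if not reference.endswith(name)]


def fixed_select_queries(reference, candidates):
    """Hypothetical filter: skips only the candidate that is exactly the reference."""
    return [name for name in candidates if name != reference]


if __name__ == "__main__":
    files = ["H4476.fasta", "6.fasta", "76.fasta", "476.fasta", "bla6bla.fasta"]
    print(buggy_select_queries("H4476.fasta", files))
    print(fixed_select_queries("H4476.fasta", files))
```

With the file names from this report, the suffix-based filter keeps only bla6bla.fasta, while the exact-match filter keeps every genome except the reference.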

hi aldertzomer,

> I'm assuming this is some sort of bug in the code that excludes the reference sequence from being selected as a query genome.

thanks for opening this issue. this is exactly what is happening. Fix for this is on the way; a temporary workaround would be to rename the query genomes, as you've suggested.
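
Until the fix lands, the rename workaround can be scripted. This is a rough sketch, assuming the collision rule is "the reference name ends with the genome name", which matches the report but is not confirmed against the parsnp source. The helper names and the "bla" padding are illustrative only.

```python
import os


def padded_name(name, pad="bla"):
    """Wrap the file stem in padding text, e.g. 6.fasta -> bla6bla.fasta."""
    stem, ext = os.path.splitext(name)
    return f"{pad}{stem}{pad}{ext}"


def plan_renames(reference, names, pad="bla"):
    """Map each colliding genome name to a padded name; the reference itself is left alone."""
    return {
        name: padded_name(name, pad)
        for name in names
        if name != reference and reference.endswith(name)
    }


def apply_renames(directory, plan):
    """Rename the files on disk according to a plan from plan_renames."""
    for old, new in plan.items():
        os.rename(os.path.join(directory, old), os.path.join(directory, new))
```

Calling `plan_renames` first and inspecting the result before `apply_renames` makes it easy to check that only the affected genomes are touched.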