Segmentation fault when using --lowest species
donovan-h-parks opened this issue · 3 comments
Hi,
I've run into an issue where MetaCache runs as expected using the following parameters, but crashes with "Command terminated by signal 11" when the --lowest species flag is added:
-pairfiles -no-map -taxids -lineage -separate-cols -threads 32 -abundances profile.tsv -abundance-per species -out classification.log"
Is there a set of incompatible flags I'm using or is it possible that using the -lowest flag has uncovered a bug?
Thanks,
Donovan
Interestingly, everything works if I use -lowest subspecies, which makes me think there is a sequence that somehow has an invalid species name. I'm using the recommended RefSeq DB with the NCBI taxonomy, as per the MetaCache instructions. I've noticed that NCBI sometimes has genomes with invalid taxon IDs (i.e., the NCBI taxonomy has been updated, but the associated genome data has not been updated yet). Perhaps a similar issue is happening here.
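One way to test this hypothesis outside of MetaCache is sketched below. It assumes the standard NCBI `nodes.dmp` layout (fields separated by `|`, with tax ID, parent tax ID, and rank in the first three columns) and a list of taxon IDs for the reference genomes collected separately. The function names are hypothetical and not part of MetaCache. The sketch only flags IDs that are missing from the taxonomy or that have no ancestor at the species rank; it does not diagnose the segfault itself.

```python
def parse_nodes(lines):
    """Parse NCBI nodes.dmp lines into {tax_id: (parent_id, rank)}."""
    nodes = {}
    for line in lines:
        # Fields are delimited by '|' and padded with tabs; each line ends in '|'.
        fields = [f.strip() for f in line.rstrip("\n").rstrip("|").split("|")]
        if len(fields) < 3 or not fields[0]:
            continue  # skip blank or malformed lines
        nodes[int(fields[0])] = (int(fields[1]), fields[2])
    return nodes


def find_suspect_taxids(taxids, nodes, rank="species"):
    """Return (missing, no_rank): IDs absent from the taxonomy, and IDs
    whose lineage contains no node of the requested rank."""
    missing, no_rank = [], []
    for tid in taxids:
        if tid not in nodes:
            missing.append(tid)
            continue
        current, seen, found = tid, set(), False
        # Walk up the lineage; 'seen' stops at the self-parented root.
        while current in nodes and current not in seen:
            seen.add(current)
            parent, node_rank = nodes[current]
            if node_rank == rank:
                found = True
                break
            current = parent
        if not found:
            no_rank.append(tid)
    return missing, no_rank
```

To use it, one could pass an open `nodes.dmp` file handle to `parse_nodes` and the genome taxon IDs to `find_suspect_taxids`. Any ID in either returned list would be a candidate for closer inspection.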
Hi Donovan! I'm not sure where bad taxonomy data could cause a segfault. Invalid taxon ids should be ignored by MetaCache. Does this error happen only with abundance output?
Can you please check if the per-read output works (dropping -no-map) with default output / -taxids-only?
Can I send you the data that is causing the bug? It is ~100 GB, but I can upload it to an FTP site if you can make one available.