undef phylum
XClaws opened this issue · 3 comments
Dear Team,
I am trying to use blobtools to check the content of my assembly.
I have run blastn, and most of my contigs have a hit with a taxid. I am sure the hits are in the right order for downstream analysis with blobtools.
However, I found that all the contigs in the blobDB.table.txt file have undef in the phylum column.
Why did this happen? Could you please help me to fix it?
Thank you!
From Laetsch and Blaxter, 2017
[..] three non-canonical taxonomic annotations are possible:
‘no-hit’, the suffix ‘-undef’ and ‘unresolved’. Sequences not
assigned to any taxonomic group, or not passing the --min_score
threshold, are labelled ‘no-hit’. If a NCBI TaxID has no explicit
parent at a taxonomic rank, the suffix ‘-undef’ is appended to the
next upper taxonomic rank for which one does exist. In cases where
the score difference between the best and second-best hits is smaller
than --min_diff, sequences are labelled as ‘unresolved’.
Seems like your organism has no phylum in the NCBI taxonomy database.
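To make the three rules quoted above concrete, here is a minimal Python sketch. It is only an illustration of the description in the paper, not the blobtools source code: the function names, the rank list, and the toy lineage are all made up, and treating "not passing --min_score" as "best score below the threshold" is an assumption.

```python
# Illustrative only: mirrors the quoted description, not the blobtools implementation.
# Hypothetical, simplified rank list ordered from lowest to highest.
RANKS = ["species", "genus", "family", "order", "phylum", "superkingdom"]


def label_at_rank(lineage, rank):
    """Return the name at `rank`, or '<next upper rank name>-undef' if it is missing."""
    if rank in lineage:
        return lineage[rank]
    # Walk upward from the requested rank to the first rank that does have a name.
    for upper in RANKS[RANKS.index(rank) + 1:]:
        if upper in lineage:
            return lineage[upper] + "-undef"
    return "undef"


def resolve(scores, min_score=0.0, min_diff=0.0):
    """Pick a label from {label: score}, applying the 'no-hit' and 'unresolved' rules."""
    if not scores:
        return "no-hit"
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best_score = ranked[0]
    if best_score < min_score:  # assumption: this is what "not passing" means
        return "no-hit"
    if len(ranked) > 1 and best_score - ranked[1][1] < min_diff:
        return "unresolved"
    return best_label


# Made-up organism whose lineage has no phylum entry.
toy_lineage = {
    "species": "Examplus demo",
    "genus": "Examplus",
    "superkingdom": "Eukaryota",
}
print(label_at_rank(toy_lineage, "phylum"))
print(label_at_rank(toy_lineage, "genus"))
```

With a lineage like the toy one, every contig assigned to that organism would get an "-undef" label at phylum rank while still having a proper name at genus rank.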
cheers,
dom
Dear Dom,
I think you are right. Although my organism is in nodeDB.txt, it seems it has no taxid in the nt database.
Thank you!
Hi XClaws,
No worries. It is not as uncommon as one might think. You can check the taxonomy lineage for your organism on https://www.ncbi.nlm.nih.gov/taxonomy ...
And then just generate plots/tables using the taxonomic ranks that make sense for your organism(s)...
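If you would rather check the lineage offline than on the website, something along these lines should work against the nodes.dmp and names.dmp files from the NCBI taxdump archive, which use "|" between tab-padded fields. Treat it as a rough sketch: the helper names are hypothetical, and the rows below are made-up toy data (only the root entry with taxid 1 follows the real convention of being its own parent).

```python
def split_dmp(line):
    """Split one taxdump row on '|' and strip the tab padding around each field."""
    return [field.strip() for field in line.split("|")]


def load_nodes(lines):
    """Map taxid -> (parent taxid, rank) from nodes.dmp-style rows."""
    nodes = {}
    for line in lines:
        taxid, parent, rank = split_dmp(line)[:3]
        nodes[taxid] = (parent, rank)
    return nodes


def load_names(lines):
    """Map taxid -> scientific name from names.dmp-style rows."""
    names = {}
    for line in lines:
        fields = split_dmp(line)
        if fields[3] == "scientific name":
            names[fields[0]] = fields[1]
    return names


def lineage(taxid, nodes, names):
    """Walk from `taxid` up to the root, returning (rank, name) pairs."""
    path = []
    while True:
        parent, rank = nodes[taxid]
        path.append((rank, names.get(taxid, "?")))
        if parent == taxid:  # the root node is its own parent
            break
        taxid = parent
    return path


# Toy rows in taxdump layout; taxids 101-103 and all names are invented.
NODES_DMP = [
    "1\t|\t1\t|\tno rank\t|\n",
    "101\t|\t1\t|\tsuperkingdom\t|\n",
    "102\t|\t101\t|\tgenus\t|\n",
    "103\t|\t102\t|\tspecies\t|\n",
]
NAMES_DMP = [
    "1\t|\troot\t|\t\t|\tscientific name\t|\n",
    "101\t|\tExamplota\t|\t\t|\tscientific name\t|\n",
    "102\t|\tExamplus\t|\t\t|\tscientific name\t|\n",
    "103\t|\tExamplus demo\t|\t\t|\tscientific name\t|\n",
]

nodes = load_nodes(NODES_DMP)
names = load_names(NAMES_DMP)
for rank, name in lineage("103", nodes, names):
    print(rank, name, sep="\t")
```

Any rank that does not show up in the printed lineage (phylum, in this toy case) is one where the organism would end up with an "-undef" label, so pick a rank that is actually present when generating plots and tables.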
cheers,
dom