Exclude species with no rank
Hi,
Is it possible to distinguish real species from subspecies-level taxa that have no rank?
For example:
TAXID 634452: Acetobacter pasteurianus IFO 3283-01 (no rank according to NCBI)
TAXID 438: Acetobacter pasteurianus
In the resulting taxon table both have the same species name, and I am trying to exclude the first one from the results.
Best, Michael
I'm not sure of the exact definition of real species versus subspecies, but I guess getRawTaxonomy
might get you at least part of the way there. For example:
> taxa=taxonomizr::getRawTaxonomy(c(438,634452),'accessionTaxa.sql')
> print(taxa)
$` 438`
species genus
"Acetobacter pasteurianus" "Acetobacter"
family order
"Acetobacteraceae" "Rhodospirillales"
class phylum
"Alphaproteobacteria" "Proteobacteria"
superkingdom no rank
"Bacteria" "cellular organisms"
$`634452`
no rank species
"Acetobacter pasteurianus IFO 3283-01" "Acetobacter pasteurianus"
genus family
"Acetobacter" "Acetobacteraceae"
order class
"Rhodospirillales" "Alphaproteobacteria"
phylum superkingdom
"Proteobacteria" "Bacteria"
no rank.1
"cellular organisms"
> isRealSpecies=sapply(taxa,function(xx)names(xx)[1]=='species')
> print(isRealSpecies)
438 634452
TRUE FALSE
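This relies on getRawTaxonomy listing ranks from most specific to least specific, so names(xx)[1] is the lowest rank of each taxon. It assumes that the lowest rank of a "real species" is species and that any taxon whose lowest rank is something else is not a "real species". That seems reasonable-ish at first glance, but it wouldn't surprise me if there were some funny business somewhere in the taxonomy, so be a bit careful with it.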
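To go from that check to actually dropping rows, something like the following might work. It is only a sketch: the `taxa` list is a hard-coded copy of the getRawTaxonomy output above (so it runs without the SQLite database), and `taxonTable` with its `taxId` and `species` columns is a made-up stand-in for your real taxon table.

```r
# Hard-coded copy of the getRawTaxonomy() result shown above (assumption:
# in real use this would come from taxonomizr::getRawTaxonomy()).
taxa <- list(
  "438" = c(
    species = "Acetobacter pasteurianus", genus = "Acetobacter",
    family = "Acetobacteraceae", order = "Rhodospirillales",
    class = "Alphaproteobacteria", phylum = "Proteobacteria",
    superkingdom = "Bacteria", "no rank" = "cellular organisms"
  ),
  "634452" = c(
    "no rank" = "Acetobacter pasteurianus IFO 3283-01",
    species = "Acetobacter pasteurianus", genus = "Acetobacter",
    family = "Acetobacteraceae", order = "Rhodospirillales",
    class = "Alphaproteobacteria", phylum = "Proteobacteria",
    superkingdom = "Bacteria", "no rank.1" = "cellular organisms"
  )
)

# The list names can carry padding (e.g. " 438"), so trim them before matching.
names(taxa) <- trimws(names(taxa))

# Lowest rank of each taxon: the first name in each vector.
lowestRank <- vapply(taxa, function(xx) names(xx)[1], character(1))

# Hypothetical taxon table; column names are illustrative only.
taxonTable <- data.frame(
  taxId = c(438L, 634452L),
  species = c("Acetobacter pasteurianus", "Acetobacter pasteurianus"),
  stringsAsFactors = FALSE
)

# Keep only rows whose taxon has 'species' as its lowest rank.
keep <- lowestRank[as.character(taxonTable$taxId)] == "species"
filtered <- taxonTable[unname(keep), , drop = FALSE]
print(filtered)
```

Matching by taxonomy ID rather than by position avoids problems if the table and the `taxa` list are in different orders. An ID that is missing from `taxa` would give NA in `keep`, so you may want to handle that case explicitly.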