sherrillmix/taxonomizr

Exclude species with no rank

Hi,

Is it possible to distinguish real species from subspecies that do not have a rank?
For example:
TAXID 634452: Acetobacter pasteurianus IFO 3283-01 (no rank according to NCBI)
TAXID 438: Acetobacter pasteurianus

But in the resulting taxon table both have the same species name, and I am trying to exclude the first one from the results.

Best, Michael

I'm not sure of the exact definition of real species versus subspecies, but I guess getRawTaxonomy might get you at least part of the way there. For example:

> taxa=taxonomizr::getRawTaxonomy(c(438,634452),'accessionTaxa.sql')
> print(taxa)
$`   438`
                   species                      genus 
"Acetobacter pasteurianus"              "Acetobacter" 
                    family                      order 
        "Acetobacteraceae"         "Rhodospirillales" 
                     class                     phylum 
     "Alphaproteobacteria"           "Proteobacteria" 
              superkingdom                    no rank 
                "Bacteria"       "cellular organisms" 

$`634452`
                               no rank                                species 
"Acetobacter pasteurianus IFO 3283-01"             "Acetobacter pasteurianus" 
                                 genus                                 family 
                         "Acetobacter"                     "Acetobacteraceae" 
                                 order                                  class 
                    "Rhodospirillales"                  "Alphaproteobacteria" 
                                phylum                           superkingdom 
                      "Proteobacteria"                             "Bacteria" 
                             no rank.1 
                  "cellular organisms"
> isRealSpecies=sapply(taxa,function(xx)names(xx)[1]=='species')
> print(isRealSpecies)
   438 634452 
  TRUE  FALSE

That assumes a "real species" is any taxon whose lowest rank is species and that every other taxon is not a "real species". That seems reasonable-ish at first glance, but it wouldn't surprise me if there were some funny business somewhere in the taxonomy, so be a bit careful with that.
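
If it helps, here is a minimal sketch of turning the check above into a filter that drops the non-species entries. It uses a small hand-built list shaped like the getRawTaxonomy output shown earlier, so it runs without the accessionTaxa.sql database. The helper name isSpeciesLevel is made up for illustration and is not part of taxonomizr. The guard for empty entries is an assumption on my part, since I have not checked here what comes back for an ID that is missing from the database.

```r
# Hand-built stand-in for the getRawTaxonomy() output shown above
# (truncated to the lowest ranks), so no database is needed.
taxa <- list(
  "438" = c(
    species = "Acetobacter pasteurianus",
    genus = "Acetobacter"
  ),
  "634452" = c(
    "no rank" = "Acetobacter pasteurianus IFO 3283-01",
    species = "Acetobacter pasteurianus",
    genus = "Acetobacter"
  )
)

# Hypothetical helper: TRUE only when the first (lowest) rank is "species".
# Empty or unnamed entries are treated as not species-level (an assumption).
isSpeciesLevel <- function(xx) {
  length(xx) > 0 && !is.null(names(xx)) && identical(names(xx)[1], "species")
}

# vapply guarantees one logical value per taxon, unlike sapply.
keep <- vapply(taxa, isSpeciesLevel, logical(1))

# Keep only the taxa whose lowest rank is species.
speciesOnly <- taxa[keep]
print(names(speciesOnly))
```

With the real data you would replace the hand-built list with the result of getRawTaxonomy and then use keep to subset whatever table you built from the same IDs, as long as the rows are in the same order as the IDs you passed in.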