jbisanz/qiime2R

qza_to_phyloseq doesn't handle taxonomy.qza generated using Silva

Closed this issue · 7 comments

Hi @jbisanz
qza_to_phyloseq() only correctly imports a taxonomy.qza generated with the Greengenes database, not Silva. Sorry, I don't have the object for you, but in the latter case the Taxon is written as, for example:

"D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia;D_3__Clostridiales;D_4__Ruminococcaceae;D_5__Ruminococcaceae NK4A214 group;D_6__uncultured bacterium"

So lines 38-41 in qza_to_phyloseq.R are not adapted to this format. I can open a PR, if you like.

That would be great if you could! It needs a more elegant way of detecting the separator and handling variable length taxonomic strings, but I have not had the time to address this myself.

Jordan
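
The two requirements mentioned above (tolerating either separator and handling taxonomy strings of variable length) can be sketched in base R. This is only an illustration under stated assumptions, not the code in qza_to_phyloseq.R: parse_taxonomy is a hypothetical helper, the seven rank names are assumed, and the Greengenes-style example string is made up for the demonstration.

```r
# Hypothetical helper, not part of qiime2R: split taxonomy strings into a
# fixed-width character matrix with one column per rank.
parse_taxonomy <- function(taxon,
                           ranks = c("Kingdom", "Phylum", "Class", "Order",
                                     "Family", "Genus", "Species")) {
  # Split on ";" with optional surrounding whitespace, which covers both
  # "; " (Greengenes-style) and ";" (Silva-style) separators.
  parts <- strsplit(as.character(taxon), "\\s*;\\s*", perl = TRUE)
  out <- t(vapply(parts, function(p) {
    # Strip rank prefixes such as "D_0__" (Silva) or "k__" (Greengenes).
    p <- sub("^(D_\\d+|[a-z])__", "", p, perl = TRUE)
    p[which(p == "")] <- NA_character_  # an empty rank such as "s__" becomes NA
    length(p) <- length(ranks)          # pad short strings with NA, truncate long ones
    p
  }, character(length(ranks))))
  colnames(out) <- ranks
  out
}

# Assumed example inputs: a truncated Silva string and a Greengenes-style string.
silva <- "D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia"
gg <- "k__Bacteria; p__Firmicutes; c__Clostridia; o__Clostridiales; f__; g__; s__"
tax_mat <- parse_taxonomy(c(silva, gg))
```

The result is a character matrix with one row per input string, where ranks that are missing or empty are NA. Row names (for example the feature IDs) would still need to be set before passing it to phyloseq::tax_table().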

If you're looking for a simple manual fix you can run:

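library(dplyr)  # provides %>% and mutate()

# Assumes the first tax_table column is named "Kingdom" and holds the full Silva string
tax <- data.frame(phyloseq::tax_table(ps)[, 1]) %>%
  mutate(Kingdom = stringr::str_replace_all(Kingdom, "D_\\d__", ""))
tax <- tax %>%
  tidyr::separate(Kingdom, c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"), sep = ";")

# Write the split ranks back into the phyloseq object
tax_mat <- as.matrix(tax)
rownames(tax_mat) <- phyloseq::taxa_names(ps)
phyloseq::tax_table(ps) <- tax_mat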

Hi @nebfield

I'm having the same issue. I've tried using your script, but I can't get it to work. I get the following error:

> tax <- data.frame(phyloseq::tax_table(ps)[, 1]) %>%
+   mutate(Kingdom = stringr::str_replace_all(Kingdom, "D_\\d__", ""))
Error in phyloseq::tax_table(ps) : object 'ps' not found

Is there something I am doing wrong/am I missing a package?

Hi, ps should be replaced with the name of your phyloseq object. Sorry for the confusion.

Thanks @nebfield, sorry I'm new to this.

Running the script:

tax <- data.frame(phyloseq::tax_table(ps)[, 1]) %>%
  mutate(Kingdom = stringr::str_replace_all(Kingdom, "D_\\d__", ""))

I get the following error:

Error in stri_replace_all_regex(string, pattern, fix_replacement(replacement), :
object 'Kingdom' not found

Do you know what is causing this?

It's hard to say. What's the output of:

tax <- data.frame(phyloseq::tax_table(ps)[, 1])
colnames(tax)

Hi @nebfield

The output is:

Error in is.data.frame(x) : object 'tax' not found

Am I correct if I use the following script?

taxonomy <- read_qza("taxonomy.qza")
tax_table<-do.call(rbind, strsplit(as.character(taxonomy$data$Taxon), "; "))

tax <- data.frame(phyloseq::tax_table(tax_table)[, 1]) %>%
  mutate(Kingdom = stringr::str_replace_all(Kingdom, "D_\\d__", ""))

Or have I done something wrong there?
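
Two things in the script above may be worth checking, shown here as a minimal base R sketch. The taxon vector is a stand-in with made-up values in the Silva format quoted earlier, used in place of taxonomy$data$Taxon, and only three ranks are shown. The sketch also assumes every string has the same number of ranks, since rbind would otherwise recycle values.

```r
# Stand-in for taxonomy$data$Taxon (hypothetical values in the Silva format).
taxon <- c("D_0__Bacteria;D_1__Firmicutes;D_2__Clostridia",
           "D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria")

# Splitting on "; " finds no match in these strings, so each stays in one piece.
unsplit <- do.call(rbind, strsplit(taxon, "; "))
ncol(unsplit)      # a single column still holding the full string
colnames(unsplit)  # NULL: no column called "Kingdom" exists yet

# Splitting on ";" gives one column per rank.
tax_table <- do.call(rbind, strsplit(taxon, ";"))
# Strip the "D_<n>__" prefixes; assigning into [] keeps the matrix shape.
tax_table[] <- sub("^D_\\d+__", "", tax_table, perl = TRUE)
# Name the columns before referring to them by name.
colnames(tax_table) <- c("Kingdom", "Phylum", "Class")
```

If the columns are unnamed, a later step that refers to Kingdom has nothing to find, which would be consistent with the "object 'Kingdom' not found" error, although that is an assumption about the cause rather than a confirmed diagnosis.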