lch14forever/microbiomeViz

does not work for "enterotype" and my data

2533245542 opened this issue · 4 comments

Hi, I tried using the "enterotype" dataset that ships with phyloseq, and it got stuck at fix_duplicate_tax(). Here is the related information:
(two images attached)

For my data, it got stuck at parsePhyloseq():
(two images attached)
I am sorry that I cannot share my data, but I am happy to follow any instructions.

It turns out that enterotype data does not have the full taxonomy information:

> head(tax_table(enterotype))
Taxonomy Table:     [6 taxa by 1 taxonomic ranks]:
                 Genus             
-1               NA                
Bacteria         NA                
Prosthecochloris "Prosthecochloris"
Chloroflexus     "Chloroflexus"    
Dehalococcoides  "Dehalococcoides" 
Thermus          "Thermus" 

As compared to GlobalPatterns:

> head(tax_table(GlobalPatterns))
Taxonomy Table:     [6 taxa by 7 taxonomic ranks]:
       Kingdom   Phylum          Class          Order          Family          Genus       
549322 "Archaea" "Crenarchaeota" "Thermoprotei" NA             NA              NA          
522457 "Archaea" "Crenarchaeota" "Thermoprotei" NA             NA              NA          
951    "Archaea" "Crenarchaeota" "Thermoprotei" "Sulfolobales" "Sulfolobaceae" "Sulfolobus"
244423 "Archaea" "Crenarchaeota" "Sd-NA"        NA             NA              NA          
586076 "Archaea" "Crenarchaeota" "Sd-NA"        NA             NA              NA          
246140 "Archaea" "Crenarchaeota" "Sd-NA"        NA             NA              NA          
       Species                   
549322 NA                        
522457 NA                        
951    "Sulfolobusacidocaldarius"
244423 NA                        
586076 NA                        
246140 NA   

The package currently needs the full taxonomic hierarchy in the data. I will also point out a limitation: the parsers need the complete dataset. If you remove the genus Bacteroides, all species belonging to it must be removed as well, so that the inferred tree is complete.
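One way to catch this before calling the parsers is to look for lineages in which a lower rank is filled in while a higher rank is missing. The following is a minimal sketch in base R, not part of microbiomeViz: `has_gap` is a hypothetical helper, the small matrix is made-up illustration data, and treating such gaps as the cause of an incomplete tree is an assumption based on the limitation described above.

```r
# Made-up taxonomy matrix: one row per taxon, one column per rank.
tax <- matrix(
  c("Archaea", "Crenarchaeota", "Thermoprotei",
    "Archaea", NA,              "Thermoprotei"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("otu1", "otu2"), c("Kingdom", "Phylum", "Class"))
)

# Hypothetical helper: TRUE when a filled rank follows a missing (NA) rank,
# i.e. the lineage has a hole in the middle.
has_gap <- function(lineage) {
  filled <- as.integer(!is.na(lineage))
  any(diff(filled) > 0)
}

# More than one rank is needed to build a tree at all.
n_ranks <- ncol(tax)

# Per-taxon check; otu2 has a Class but no Phylum.
gaps <- apply(tax, 1, has_gap)
print(n_ranks)
print(gaps)
```

For a phyloseq object, the same check could presumably be run on `as(tax_table(physeq), "matrix")`; trailing NAs, as in the GlobalPatterns rows above, are not flagged by this sketch.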

Sure, thank you. I have not figured out the exact problem with my dataset. While approaching it from other angles, I found what might be a bug in annotation. In the standard example, I tried to annotate Crenarchaeota in red, and an error occurred. Here is what happened, for reproducibility:

library(phyloseq)
library(microbiomeViz)
data("GlobalPatterns")
GP = GlobalPatterns

GP = transform_sample_counts(GlobalPatterns, function(otu) otu/sum(otu))
GP = filter_taxa(GP, function(x) max(x)>=0.01,TRUE)
GP = fix_duplicate_tax(GP)

tr = parsePhyloseq(GP)
p = tree.backbone(tr, size=1)
anno.data <- data.frame(node=c("Crenarchaeota"),
                        color='red', stringsAsFactors = FALSE)
> clade.anno(p, anno.data)
Error in filter_impl(.data, quo) : Result must have length 271, not 0

The parsePhyloseq function actually prepends a taxonomic rank prefix to each taxon name, so simply changing "Crenarchaeota" to "p__Crenarchaeota" solves the problem. I apologize for only seeing this after seven months. Hopefully you figured something out...
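A quick way to spot this kind of mismatch is to compare the annotation node names against the tree labels before calling clade.anno. The following is a minimal sketch in base R: `unmatched_nodes` is a hypothetical helper, not a microbiomeViz function, and the label vector is illustrative. Only "p__Crenarchaeota" is confirmed in this thread; the other labels are assumed to follow the same prefix pattern.

```r
# Illustrative tree labels (assumed, apart from "p__Crenarchaeota").
tree_labels <- c("k__Archaea", "p__Crenarchaeota", "c__Thermoprotei")

# Hypothetical helper: annotation nodes that match no label in the tree.
unmatched_nodes <- function(anno, labels) {
  setdiff(anno$node, labels)
}

bad  <- data.frame(node = "Crenarchaeota",    color = "red", stringsAsFactors = FALSE)
good <- data.frame(node = "p__Crenarchaeota", color = "red", stringsAsFactors = FALSE)

print(unmatched_nodes(bad, tree_labels))   # the unprefixed name is not found
print(unmatched_nodes(good, tree_labels))  # empty: the prefixed name matches

# With the prefixed name, the call from the report becomes:
# clade.anno(p, good)
```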

Fixed by pull request #15