Summarizing data from mpse object without changing class
etorres475 opened this issue · 9 comments
Hi Shuangbin,
In my experimental design, I sequenced some samples and sub-samples. I would like to summarize and calculate the mean abundances for those samples and sub-samples so they represent one single sample in the posterior analysis with MicrobiotaProcess. When I try to do it using dpyr on the mpse object, it is converted into a tibble data (tbl_mpse object) and then into a tbl_df. I tried using as.mpse on the resulting file but it ended in NULL (please see the screenshots attached). Is it possible to do this in some other way?
Cheers!
I am getting another error when using mp_cal_rarecurve in:
mpse12.a <- mpse12 %>% mp_rrarefy() %>% mp_cal_alpha(.abundance = RareAbundance) %>% mp_cal_rarecurve(.abundance = RareAbundance)
Data can be found at: https://www.dropbox.com/s/eeblzk2uqggyv6o/mpse12.obj?dl=0
you can refer to the related issue and answer
In my experimental design, I sequenced some samples and sub-samples. I would like to summarize and calculate the mean abundances for those samples and sub-samples so they represent one single sample in the posterior analysis with MicrobiotaProcess. When I try to do it using dpyr on the mpse object, it is converted into a tibble data (tbl_mpse object) and then into a tbl_df. I tried using as.mpse on the resulting file but it ended in NULL (please see the screenshots attached). Is it possible to do this in some other way?
I think you can use mp_aggregate
to do it.
library(MicrobiotaProcess)
data(mouse.time.mpse)
mouse.time.mpse %>% mp_aggregate(.group=time)
you can refer to the related issue and answer
Thanks a lot for your help. I want to know which Phyla are differentially expressed, not the phyla to which the biomarkers belong, so I need to specify that I want to calculate the mp_diff_analysis in the Phyla.
I am trying the following:
b.ptt.16 <- b.mpse16 %>% mp_rrarefy() %>% mp_diff_analysis(.abundance=RareAbundance, .group=Plants_grown_in_soil, p.adjust = "fdr",first.test.alpha = 0.05,filter.p = "pvalue", taxa.class = "Phylum") %>% mp_extract_taxatree()
But I get the following warning:
What am I doing wrong?
-
I want to know which Phyla are differentially expressed, not the phyla to which the biomarkers belong, so I need to specify that I want to calculate the mp_diff_analysis in the Phyla.
First, you can extract the differential result (taxatree
slot, a treedata
object ).
b.mpse16 %>% mp_rrarefy() %>% mp_diff_analysis(.abundance=RareAbundance, .group=Plants_grown_in_soil, p.adjust = "fdr",first.test.alpha = 0.05,filter.p = "pvalue") %>% mp_extract_taxatree() -> daa.td
The daa.td
is a treedata
object which can be used ggtree
, treeio
, tidytree
and ggtreeExtra
directly. You can find nodeClass
contains different taxonomy.
> daa.td %>% dplyr::pull(nodeClass) %>% unique
[1] "OTU" "Root" "Kingdom" "Phylum" "Class" "Order" "Family"
[8] "Genus" "Speies"
Then, you can use filter
of dplyr
to choose the specified taxa.
> daa.td %>% dplyr::filter(nodeClass == 'Phylum' & !is.na(Sign_Plants_grown_in_soil), keep.td=F)
# A tibble: 1 × 12
node label isTip nodeClass nodeDepth RareAbundanceBySample LDAupper LDAmean
<int> <chr> <lgl> <chr> <dbl> <list> <dbl> <dbl>
1 2584 p__Aci… FALSE Phylum 2 <tibble [45 × 5]> 4.19 4.15
# ℹ 4 more variables: LDAlower <dbl>, Sign_Plants_grown_in_soil <chr>,
# pvalue <dbl>, fdr <dbl>
-
mouse.time.mpse %>% mp_aggregate(.group=time)
I got this:
Please re-install MicrobiotaProcess
by remotes::install_github(YuLab-SMU/MicrobiotaProcess)
.
It worked perfectly! Thanks a lot! :)
The only remaining error I keep getting is this:
I am getting another error when using mp_cal_rarecurve in:
mpse12.a <- mpse12 %>% mp_rrarefy() %>% mp_cal_alpha(.abundance = RareAbundance) %>% mp_cal_rarecurve(.abundance = RareAbundance)
Data can be found at: https://www.dropbox.com/s/eeblzk2uqggyv6o/mpse12.obj?dl=0
I checked and apparently, something is going on after filtering the features with low abundance (last step in the following command):
otuqzafile_12 <- "./its_12/table_its_12_filtered.qza"
taxaqzafile_12 <- "./its_12/taxonomy-its-12.qza"
mapfile_12 <- "./its_12/sample-metadata_its_12.txt"
mpse12 <- mp_import_qiime2(otuqza=otuqzafile_12, taxaqza=taxaqzafile_12, mapfilename=mapfile_12)
taxa_12 <- taxonomy(mpse12)
taxa_12[apply(taxa_12, 2, function(i)grepl("un", i))] <- NA
taxa_12 <- apply(taxa_12, 2, function(i)gsub("*._", "", i))
taxonomy(mpse12) <- taxa_12
rownames(mpse12) <- paste0('ASV', seq_len(nrow(mpse12)))
mpse12 %>% mp_extract_feature(addtaxa=T) %>% dplyr::pull(Phylum) %>% unique()
mpse12 <- mpse12 %>% dplyr::filter(!Phylum %in% c("p__Fungi_phy_Incertae_sedis", "p__un_k__Fungi"))
mpse12 <- mpse12 %>% mp_filter_taxa(.abundance=Abundance, min.abun=0, min.prop=.1)
If I skip this last step I can calculate my rarefaction curve.
Thanks a lot!