/ProgPerm

Progressive permutation for a dynamic representation of the robustness of microbiome discoveries

Primary LanguageRMIT LicenseMIT

ProgPermute

ProgPerm

Progressive permutation for a dynamic representation of the robustness of microbiome discoveries

The proposed method progressively permutes the grouping factor labels of microbiome and performs multiple differential abundance tests in each scenario. We compare the signal strength of top hits from the original data with their performance in permutations, and will observe an apparent decreasing trend if these top hits are true positives identified from the data. The method can evaluate the overall association between microbiome and the grouping factor, rank the robustness of the discovered microbes, and list the discoveries, their effect sizes, and individual abundances.


Table of Contents

Installation

  1. If you don't have "devtools" package, you need to download it by using
install.packages("devtools")
  1. Run the following codes:
library(devtools)
install_github("LyonsZhang/ProgPermute")
library(ProgPermute)
  1. Run the example codes in "Execute_ProgPermute.R" in the test folder https://github.com/LyonsZhang/ProgPerm/tree/master/test.

Example

Preparation

clear current session

closeAllConnections()
rm(list=ls())

load data

data(testdata1)

set the working path where the figures will be saved

setwd("/Users/lzhang27/Documents/ProgPermute/R")

Analyze overall association

run progressive permutation. "variable" denotes the name of binary outcome. "top_pm" must be less than the number of microbial features

results<-bin_permute_all(variable="variable",testdata=testdata1,top_pm=267,zoomn=15,alpha=0.05)

show U-Curve of number of significant features

sigloc<-plot_bin_psig(alloutputs=results,psigtitle=NULL,psigyrange=c(0,170),savepsigfile ="sigfeatures.eps", psigpicdim=c(10,7))

show the traces of Pvalues

pvloc<-plot_bin_pv(alloutputs=results,top_pm=267,pvtitle="",pvyrange=c(0,7),savepvfile ="pvfeatures.eps", pvpicdim=c(10,7))

show the scaled U-Curve

sigsum<-plot_bin_permute_sigcurve(alloutputs=results,samsize=267,lgndcol=2,psigtitle=NULL,savepsigfile="bin_sigcurve.eps",psigpicdim=c(10,7))

Calculate Fragility Index

intres<-bin_true_initial(variable="variable",testdata=testdata1,top_pm=267)
fragility<-bin_fragility(alloutputs=results,lowindx=intres$n,top_pm=50,lgndcol=2,yrange=c(0,7),pvtitle=NULL,savepvfile="locationPvfragility.eps",pvpicdim=c(15,7))

Calculate the progressive coverage

intres<-bin_true_initial(variable="variable",testdata=testdata1,top_pm=267)
coverage<-bin_progresscoverage(alloutputs=results,lowindx=intres$n,top_pm=50,lgndcol=2,pvtitle=NULL,savepvfile="Pvcoverage.eps",pvpicdim=c(15,7),estitle=NULL,saveesfile="effectcoverage.eps",espicdim=c(15,7))

Observe the distributions of Pvalues

bin_pv_distribution(alloutputs=results,lowindx=intres$n,folder="dist1",pvtitle=NULL,pvpicdim=c(7,7))
bin_pv_dist_transit(alloutputs=results,lowindx=intres$n,cutoff=0.05,folder="results1",pvtitle="",pvpicdim=c(7,7))

Identify the best features

run the full permutation

best<-bin_permute_best(variable="variable",testdata=testdata1,top_pm=50,zoomn=100,alpha=0.05)

Real hits lie outside of 95% quantile intervals of full permuted hits

cinfresults<-plot_bin_permute_best(bestoutputs=best,top_pm=50,pvtitle="Coverage plot",savepvfile="pvalue_Coverageplot.eps",pvpicdim=c(15,10),estitle="Coverage plot",saveesfile="effectsize_Coverageplot.eps",espicdim=c(15,10))

List the signed effect sizes of the discoveries

plot_bin_effectsize(bestoutputs=best,variable="variable",testdata=testdata1,estitle=NULL,saveesfile="signedeffectsize_plot.eps",espicdim=c(15,10))

Show the dot plot of each individual discovery

lapply(best$goodpvname,dotplot_bin_sig,variable="variable",testdata=testdata1,folder="individual1")

ShinyApp

You can access the App on https://biostatistics.mdanderson.org/shinyapps/ProgPerm

Contact

If you have any questions, please contact me at liangliangzhang.stat@gmail.com