This repository contains the code for the paper "MultiCOP: An Integrative Analysis of Microbiome-Metabolome Associations".
We introduce MultiCOP, an algorithm for the efficient integration of microbiome and metabolome data. It combines correlation pursuit with random projection to reveal microbe-metabolite interactions and to identify the relevant microbes and metabolites. MultiCOP is not limited to this setting: it can be used to investigate associations between any two high-dimensional datasets and to identify relevant features.
Taxon Set Enrichment Analysis (TSEA) is then applied to test whether the selected microbes are enriched in taxon sets functionally related to the microbiome-metabolite interaction.
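As a rough illustration of what an enrichment test of this kind computes, the sketch below runs a one-sided hypergeometric (over-representation) test in base R. This is an assumption about the general form of the test, not a description of the exact procedure used in the paper or in MicrobiomeAnalyst, which may differ in the choice of background set and in multiple-testing correction. All counts and variable names are made up for illustration.

```r
# Hypothetical counts, for illustration only
n_background <- 200  # taxa measured in the study
n_in_set     <- 20   # background taxa that belong to the taxon set
n_selected   <- 30   # taxa selected by the feature-selection step
n_overlap    <- 8    # selected taxa that also belong to the taxon set

# P(overlap >= n_overlap) when n_selected taxa are drawn at random
# from the background without replacement
p_value <- phyper(n_overlap - 1,
                  n_in_set,
                  n_background - n_in_set,
                  n_selected,
                  lower.tail = FALSE)
p_value
```

A small p-value suggests that the selected microbes fall into the taxon set more often than expected by chance. When several taxon sets are tested, the p-values would normally be adjusted, for example with `p.adjust`.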
MultiCOP takes two data tables in matrix form as input, denoted X and Y, each with dimensions n_sample by n_feature. Instructions for running MultiCOP are in example.md; the tutorial there shows how to implement the first simulation scenario.
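The input format described above can be sketched with toy data as follows. The sizes, the random values, and the row and column names are all hypothetical. The sketch also assumes that the rows of X and Y refer to the same samples in the same order and that the two tables may have different numbers of features. See example.md for the actual function call.

```r
# Toy inputs: 50 samples, 20 microbes (X), 15 metabolites (Y).
# All values and names are made up for illustration.
set.seed(1)
n_sample <- 50
X <- matrix(rnorm(n_sample * 20), nrow = n_sample, ncol = 20)
Y <- matrix(rnorm(n_sample * 15), nrow = n_sample, ncol = 15)

sample_ids  <- paste0("sample", seq_len(n_sample))
rownames(X) <- sample_ids
rownames(Y) <- sample_ids
colnames(X) <- paste0("microbe", seq_len(ncol(X)))
colnames(Y) <- paste0("metabolite", seq_len(ncol(Y)))

# Sanity checks before passing the tables on:
# both are matrices, and their rows are the same samples in the same order
stopifnot(is.matrix(X), is.matrix(Y))
stopifnot(nrow(X) == nrow(Y))
stopifnot(identical(rownames(X), rownames(Y)))
```

If your data are stored as data frames, `as.matrix()` converts them, provided all columns are numeric.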
The function is built on R version 4.1.1. The requirement.txt file lists all the packages the notebook depends on. You can check your R version with the following command:
R.version
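To compare the installed version against 4.1.1 programmatically, a minimal sketch using base R is shown below. Treating 4.1.1 as a minimum is an assumption made here. The repository only states that the code was built on that version.

```r
# Print the version as a single readable string
R.version.string

# Warn if the running R is older than the version the code was built on
# (using 4.1.1 as a lower bound is an assumption, not a stated requirement)
built_on <- "4.1.1"
if (getRversion() < built_on) {
  warning("R ", as.character(getRversion()),
          " is older than ", built_on,
          "; results may differ.")
}
```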
The original dataset of Inflammatory Bowel Disease (IBD) is available here.
The original dataset of Chronic Ischemic Heart Disease (CIHD) is available here.
- Zhong, Wenxuan, et al. "Correlation pursuit: forward stepwise variable selection for index models." Journal of the Royal Statistical Society Series B: Statistical Methodology 74.5 (2012): 849-870.
- Chong, Jasmine, et al. "Using MicrobiomeAnalyst for comprehensive statistical, functional, and meta-analysis of microbiome data." Nature Protocols 15.3 (2020): 799-821.