
Learning causal or non-causal graphical models using information theory


MIIC


This repository contains the source code for MIIC (Multivariate Information based Inductive Causation), a method that builds on constraint-based approaches to learn a large class of causal or non-causal graphical models from purely observational data, while accounting for the effects of unobserved latent variables. Starting from a complete graph, the method iteratively removes dispensable edges by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences by randomizing the available data. The remaining edges are then oriented based on the signature of causality in observational data. This approach can be applied to a wide range of datasets and provides new biological insights into regulatory networks from single-cell expression data, genomic alterations during tumor development, and co-evolving residues in protein structures. Since version 2.0, MIIC can also process stationary time series to unveil temporal causal graphs.
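To make the idea of an edge explained by an indirect path concrete, here is a minimal base-R sketch. It is not the MIIC implementation: it uses plain plug-in (frequency-based) estimates on simulated binary data, and the helper names `entropy`, `mutual_info`, `cond_mutual_info` and `flip` are hypothetical, introduced only for illustration. The package's own estimators and significance criteria are more elaborate.

```r
# Illustrative only: plug-in information estimates in base R.
# NOT the MIIC implementation; all helper names are hypothetical.

# Shannon entropy (in nats) of the joint distribution of one or more
# discrete variables, estimated from observed frequencies.
entropy <- function(...) {
  counts <- as.vector(table(...))
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Mutual information: I(X;Y) = H(X) + H(Y) - H(X,Y)
mutual_info <- function(x, y) {
  entropy(x) + entropy(y) - entropy(x, y)
}

# Conditional mutual information:
# I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
cond_mutual_info <- function(x, y, z) {
  entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)
}

# Simulate a chain X -> Z -> Y: X and Y depend on each other only through Z.
set.seed(1)
n <- 5000
# Noisy copy of a binary vector: each value is flipped with probability p.
flip <- function(v, p) ifelse(runif(length(v)) < p, 1L - v, v)
x <- rbinom(n, 1, 0.5)
z <- flip(x, 0.1)
y <- flip(z, 0.1)

mi_xy   <- mutual_info(x, y)          # clearly positive: X and Y look linked
cmi_xyz <- cond_mutual_info(x, y, z)  # near zero: the path through Z explains it
print(c(I_xy = mi_xy, I_xy_given_z = cmi_xyz))
```

In this toy setting, the X-Y edge is dispensable: once Z is conditioned on, almost no information is left between X and Y, which is the kind of evidence a constraint-based method uses to remove an edge.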

References

Simon F., Comes M. C., Tocci T., Dupuis L., Cabeli V., Lagrange N., Mencattini A., Parrini M. C., Martinelli E., Isambert H.; CausalXtract: a flexible pipeline to extract causal effects from live-cell time-lapse imaging data; eLife, reviewed preprint.

Ribeiro-Dantas M. D. C., Li H., Cabeli V., Dupuis L., Simon F., Hettal L., Hamy A. S., Isambert H.; Learning interpretable causal networks from very large datasets, application to 400,000 medical records of breast cancer patients; iScience, 2024.

Cabeli V., Li H., Ribeiro-Dantas M., Simon F., Isambert H.; Reliable causal discovery based on mutual information supremum principle for finite dataset; Why21 at NeurIPS, 2021.

Cabeli V., Verny L., Sella N., Uguzzoni G., Verny M., Isambert H.; Learning clinical networks from medical records based on information estimates in mixed-type data; PLoS Comput. Biol., 2020. doi:10.1371/journal.pcbi.1007866

Li H., Cabeli V., Sella N., Isambert H.; Constraint-based causal structure learning with consistent separating sets; Advances in Neural Information Processing Systems, 2019.

Verny L., Sella N., Affeldt S., Singh PP., Isambert H.; Learning causal networks with latent variables from multivariate information in genomic data; PLoS Comput. Biol., 2017. doi:10.1371/journal.pcbi.1005662

Prerequisites

MIIC contains R and C++ sources.

  • To compile from source, a compiler with support for C++14 language features is required.
  • MIIC imports the following R packages: ppcor, scales, stats, Rcpp
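As a quick sanity check before building from source, the following sketch reports which of the imported packages listed above are available in the current R library. It only uses base R (`requireNamespace`); the variable names are illustrative. When installing from CRAN with `install.packages()`, these dependencies are normally resolved automatically, so this check is mostly useful for source builds.

```r
# Packages listed as imports in this README.
deps <- c("ppcor", "scales", "stats", "Rcpp")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)
missing_deps <- deps[!installed]

if (length(missing_deps) > 0) {
  message("Missing packages: ", paste(missing_deps, collapse = ", "))
} else {
  message("All imported packages are available.")
}
```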

Installation

From CRAN (release):

install.packages("miic")

Or from GitHub (development):

# install.packages("remotes")
remotes::install_github("miicTeam/miic_R_package")

Quick start

MIIC allows you to create a graph object from a dataset of observations of both discrete and continuous variables, potentially with missing values, while taking into account unobserved latent variables. You can find this example, along with others, in the documentation of the main function by calling ?miic from R.

library(miic)

# EXAMPLE HEMATOPOIESIS
data(hematoData)
# execute MIIC (reconstruct graph)
miic.res <- miic(
  input_data = hematoData, latent = "yes",
  n_shuffles = 10, conf_threshold = 0.001
)

# plot graph with igraph
if(require(igraph)) {
  plot(miic.res, method="igraph")
}
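To explore what the call returns without assuming any particular component names (these may differ between package versions), a minimal sketch is to list the top-level elements of the result. The snippet is guarded so that it does nothing when miic is not installed, and it runs `miic()` with default settings apart from `input_data`; the `has_miic` variable is introduced here for illustration only.

```r
# Guarded so the snippet is a no-op when miic is not installed.
has_miic <- requireNamespace("miic", quietly = TRUE)

if (has_miic) {
  library(miic)
  data(hematoData)

  # Reconstruct the graph with default settings.
  miic.res <- miic(input_data = hematoData)

  # Inspect the returned object: its class and top-level components.
  print(class(miic.res))
  print(names(miic.res))
  str(miic.res, max.level = 1)
}
```

The help page opened by ?miic describes each returned component in detail.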

Documentation

You can find the documentation pages in the "man" folder or in the auto-generated PDF, or you can use the R functions help() and ?.

Authors

  • Tiziana Tocci
  • Nikita Lagrange
  • Orianne Debeaupuis
  • Louise Dupuis
  • Franck Simon
  • Vincent Cabeli
  • Honghao Li
  • Marcel Ribeiro Dantas
  • Louis Verny
  • Nadir Sella
  • Séverine Affeldt
  • Hervé Isambert

License

GPL-2 | GPL-3