Primary language: Python

Causality Documentation

This repository aims to provide a comprehensive collection of methods for performing causal inference, including experimental designs, statistical methods, and advanced machine learning techniques. Each method will be implemented and documented with examples.

Table of Contents

  1. Introduction
  2. Methods
  3. Development Plan
  4. Contributing
  5. License

Introduction

Causal inference aims to identify causal relationships between variables, going beyond simple correlations. This repository contains implementations of various causal inference methods, with examples and documentation for each method.
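
To make the gap between correlation and causation concrete, here is a minimal sketch on simulated data. It is not code from this repository. The data-generating process, the true effect of 2.0, and the function names (`simulate`, `diff_in_means`, `stratified_effect`) are all assumptions chosen for illustration. A binary confounder `z` raises both the chance of treatment and the outcome, so the raw difference in means overstates the effect, while averaging the within-stratum differences recovers it.

```python
import random
from statistics import mean


def simulate(n=20000, true_effect=2.0, seed=0):
    """Simulate (z, t, y) rows where z confounds treatment t and outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0                   # confounder
        t = 1 if rng.random() < (0.8 if z else 0.2) else 0   # z drives treatment
        y = true_effect * t + 5.0 * z + rng.gauss(0.0, 1.0)  # z drives outcome too
        rows.append((z, t, y))
    return rows


def diff_in_means(rows):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def stratified_effect(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    total = len(rows)
    estimate = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[0] == level]
        estimate += len(stratum) / total * diff_in_means(stratum)
    return estimate


rows = simulate()
naive = diff_in_means(rows)         # mixes the causal effect with confounding
adjusted = stratified_effect(rows)  # compares like with like within each z
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

The adjustment only works here because the confounder is observed. Most of the methods listed below exist to handle cases where that is not true or cannot be assumed.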

Methods

Experimental Methods

  1. Randomized Controlled Trials (RCTs)
  2. Field Experiments
  3. Lab Experiments
  4. Natural Experiments

Quasi-Experimental Methods

  1. Instrumental Variables (IV)
  2. Difference-in-Differences (DiD)
  3. Regression Discontinuity Design (RDD)
  4. Interrupted Time Series Analysis
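
As an illustration of the quasi-experimental idea, the following sketch computes the classic two-group, two-period Difference-in-Differences estimate on simulated data. The function name `did_estimate` and all numbers (baseline gap of 3, common trend of 1.5, effect of 2.0) are hypothetical. The estimate is valid only under the parallel-trends assumption, which the simulation satisfies by construction.

```python
import random
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


rng = random.Random(1)
n = 5000
trend, effect = 1.5, 2.0  # assumed common time trend and treatment effect

# The groups start at different levels (10 vs 13) but share the same trend.
control_pre = [10.0 + rng.gauss(0, 1) for _ in range(n)]
control_post = [10.0 + trend + rng.gauss(0, 1) for _ in range(n)]
treated_pre = [13.0 + rng.gauss(0, 1) for _ in range(n)]
treated_post = [13.0 + trend + effect + rng.gauss(0, 1) for _ in range(n)]

did = did_estimate(treated_pre, treated_post, control_pre, control_post)
post_gap = mean(treated_post) - mean(control_post)  # includes the baseline gap
print(f"DiD estimate: {did:.2f}, post-period gap: {post_gap:.2f}")
```

Differencing twice removes both the fixed gap between groups and the shared time trend, which a simple post-period comparison cannot do.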

Matching and Reweighting Methods

  1. Propensity Score Matching (PSM)
  2. Covariate Matching
  3. Inverse Probability Weighting (IPW)
  4. Genetic Matching
  5. Entropy Balancing
  6. Mahalanobis Distance Matching
  7. Coarsened Exact Matching (CEM)
  8. Nearest Neighbor Matching
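
A reweighting estimator can be sketched in a few lines. The example below is an assumption-laden illustration of Inverse Probability Weighting, not this repository's implementation: the data are simulated, the single covariate `x` is binary, and the propensity score is estimated as the treated share within each level of `x` rather than with a regression model. The names `estimate_propensity` and `ipw_ate` are hypothetical. The estimate assumes no unmeasured confounding and that every unit has a nonzero chance of either treatment.

```python
import random


def simulate(n=20000, seed=2):
    """Simulate (x, t, y) rows; x affects both treatment and outcome."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.4 else 0
        t = 1 if rng.random() < (0.7 if x else 0.3) else 0
        y = 1.0 * t + 3.0 * x + rng.gauss(0, 1)  # assumed true effect: 1.0
        data.append((x, t, y))
    return data


def estimate_propensity(data):
    """Estimate P(T=1 | X=x) as the treated share within each level of x."""
    scores = {}
    for level in {x for x, _, _ in data}:
        treatments = [t for x, t, _ in data if x == level]
        scores[level] = sum(treatments) / len(treatments)
    return scores


def ipw_ate(data, scores):
    """Normalized (Hajek) IPW estimate of the average treatment effect."""
    w1 = wy1 = w0 = wy0 = 0.0
    for x, t, y in data:
        e = scores[x]
        if t == 1:
            w = 1.0 / e          # treated units weighted by 1 / e(x)
            w1 += w
            wy1 += w * y
        else:
            w = 1.0 / (1.0 - e)  # control units weighted by 1 / (1 - e(x))
            w0 += w
            wy0 += w * y
    return wy1 / w1 - wy0 / w0


data = simulate()
scores = estimate_propensity(data)
ate = ipw_ate(data, scores)
print(f"estimated propensities: {scores}, IPW ATE: {ate:.2f}")
```

With continuous or high-dimensional covariates, the per-level frequencies would be replaced by a fitted model such as logistic regression, and extreme weights would need attention.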

Graphical and Structural Methods

  1. Causal Diagrams (Directed Acyclic Graphs, DAGs)
  2. Structural Equation Modeling (SEM)
  3. Path Analysis

Machine Learning and Advanced Statistical Methods

  1. Causal Forests
  2. Bayesian Causal Inference
  3. Double Machine Learning (DML)
  4. Targeted Maximum Likelihood Estimation (TMLE)
  5. Synthetic Control Method
  6. G-computation
  7. Marginal Structural Models (MSM)
  8. Causal Bayesian Networks
  9. Causal Discovery Algorithms (e.g., PC Algorithm, FCI Algorithm)

Time-Series and Panel Data Methods

  1. Fixed Effects Models
  2. Random Effects Models
  3. Dynamic Panel Models
  4. Panel Data Matching

Mediation and Moderation Analysis

  1. Mediation Analysis
  2. Moderation Analysis
  3. Moderated Mediation Analysis

Sensitivity Analysis and Robustness Checks

  1. Sensitivity Analysis
  2. Bounds Analysis
  3. Placebo Tests
  4. Permutation Tests
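
As one example of a robustness check, the sketch below runs a two-sided permutation test on the difference in means. The function name `permutation_p_value`, the number of permutations, and the simulated samples are assumptions for illustration. Under the null hypothesis that treatment has no effect on any unit, group labels are exchangeable, so shuffling them shows how often a gap at least as large as the observed one arises by chance.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(treated, control, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(_mean(treated) - _mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(_mean(pooled[:n_treated]) - _mean(pooled[n_treated:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


rng = random.Random(3)
treated = [1.0 + rng.gauss(0, 1) for _ in range(50)]  # shifted by 1.0
control = [rng.gauss(0, 1) for _ in range(50)]

p_value = permutation_p_value(treated, control)
print(f"permutation p-value: {p_value:.4f}")
```

The same shuffling logic underlies placebo tests: the procedure is applied where no effect should exist, and a "significant" result signals a problem with the design.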

Advanced Reweighting and Balancing Methods

  1. Inverse Probability of Treatment Weighting (IPTW)
  2. Standardized Mortality Ratio Weighting (SMRW)
  3. Calibration Weighting

Instrumental Variable Extensions

  1. Two-Stage Least Squares (2SLS)
  2. Generalized Method of Moments (GMM)
  3. Limited Information Maximum Likelihood (LIML)
  4. Control Function Approach
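
The two stages of 2SLS can be written out directly for the simplest case of one endogenous regressor and one instrument. This is a sketch under stated assumptions, not the repository's code: the helper names `ols` and `two_stage_least_squares` are hypothetical, and the simulated data assume an instrument `z` that affects the outcome `y` only through `x`, with an unobserved confounder `u` and a true coefficient of 1.5.

```python
import random


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """2SLS slope of y on x using a single instrument z."""
    # Stage 1: project the endogenous regressor x onto the instrument z.
    a1, b1 = ols(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the outcome on the fitted values from stage 1.
    _, beta = ols(x_hat, y)
    return beta


rng = random.Random(4)
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]  # instrument
u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [0.8 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.5 * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

_, ols_slope = ols(x, y)                    # biased by the confounder u
iv_slope = two_stage_least_squares(z, x, y)
print(f"OLS slope: {ols_slope:.2f}, 2SLS slope: {iv_slope:.2f}")
```

Running the stages by hand gives the right point estimate, but the standard errors from the second regression are not valid as they stand. Dedicated IV routines correct them.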

Subgroup Analysis and Heterogeneity

  1. Subgroup Analysis
  2. Quantile Treatment Effects (QTE)
  3. Heterogeneous Treatment Effects Analysis

Other Methods

  1. Matching with Multiple Controls
  2. Propensity Score Stratification
  3. Propensity Score Regression Adjustment
  4. Cross-Over Designs
  5. Regression Kink Design
  6. Fuzzy Regression Discontinuity Design
  7. Sharp Regression Discontinuity Design

Emerging and Hybrid Methods

  1. Network Causal Inference
  2. Spatial Causal Inference
  3. Integrative Causal Inference (combining multiple methods)

Causal Inference Software Tools

  1. Epidemiological Software (e.g., DAGitty)
  2. Statistical Packages (e.g., R's MatchIt, twang for IPW, Zelig; Python's causalml, DoWhy)
  3. Machine Learning Libraries (e.g., EconML, causalForest in R)

Development Plan

Methods will be implemented in phases, each with detailed documentation and examples. The proposed plan is:

  1. Initial Setup

    • Set up the repository structure.
    • Define coding standards and guidelines.
  2. Phase 1: Basic Methods

    • Implement and document basic methods such as RCTs, IV, and DiD.
    • Provide examples and use cases for each method.
  3. Phase 2: Intermediate Methods

    • Implement matching and reweighting methods.
    • Include detailed documentation and examples.
  4. Phase 3: Advanced Methods

    • Implement graphical and structural methods, machine learning methods, and time-series methods.
    • Provide complex examples and case studies.
  5. Phase 4: Robustness and Sensitivity Analysis

    • Implement sensitivity analysis and robustness checks.
    • Document common pitfalls and how to address them.
  6. Phase 5: Emerging Methods

    • Implement and document emerging and hybrid methods.
    • Include innovative applications and case studies.
  7. Final Phase: Integration and Testing

    • Integrate all methods into a cohesive framework.
    • Conduct comprehensive testing and validation.
    • Prepare final documentation and examples.

Contributing

We welcome contributions from the community. Please follow our contributing guidelines to get started.

License

This project is licensed under the MIT License; see the LICENSE file for details.