Implementation of F. R. Guo, E. Perković (2022). Efficient Least Squares for Estimating Total Effects under Linearity and Causal Sufficiency.
This is a part of my Ph.D. Preliminary Examination at the Department of Statistics at University of Washington.
All simulations were performed in R v4.1.2 using pcalg
v2.7.5. Standard graphical libraries are required along with tidyverse
and ggplot2
.
install.packages("pcalg")
install.packages("igraph")
install.packages("RBGL")
install.packages("tidyverse")
install.packages("ggplot2")
install.packages("latex2exp")
install.packages("ggh4x")
install.packages("optparse")
Code\bucket_decomposition.R
performs partial causal ordering.Code\cpdag.R
is my own implementation of coverting a DAG into a CPDAG since the clusters owned by the Department of Statistics at UW had issues runningpcalg:::dag2cpdag
.Code\identifiability.R
contains implementation of finding ancestors, descendants, directed paths, etc. Most importantly, it checks for the causal effect identifiability condition.Code\mle.R
computes the G-regression estimator.Code\sampler.R
samples random DAGs, treatment sets and outcome variable, and data points.
All simulations are seeded. So they should be reproducible.
$ cd Code
$ Rscript run_simulations.R --num_nodes 10 --treatment_size 2 --n 1000 --num_reps 5 --output ../Results/test.rds
The above command uses the true CPDAG to estimate causal effects. To replicate results in Appendix G where the CPDAG is estimated using Greedy Equivalence Search, replace run_simulations.R
with run_simulations_unknown_cpdag.R
.
Arguments:
--num_nodes
- Number of nodes in the DAG--treatment_size
- Number of variables in the treatment set--n
- Number of iid data points--num_reps
- Number of replications--output
- File to store output
The cluster jobs in Code\Slurm\synthetic_sims\
can automate replicating results in the report. The cluster jobs in Code\Slurm\synthetic_sims_ges\
can help replicate results in Appendix G. Plots are generated using Code\results.R
. You might need to modify filenamse to accomodate your results.
The dataset was obtained from https://www.synapse.org/#!Synapse:syn3049712/wiki/74630. This is locally available in the Data\
directory. Table 3 in the report can be generated using the following commands:
$ cd Code
$ Rscript dream4_results.R
$ Rscript dream4_bootstrap.R
dream4_results.R
generates the normalized squared errors. dream4_bootstrap.R
computes standard errors using a bootstrap approach described in Appendix E.