awesome-causality-data

A data index for learning causality.

MIT License

An index of datasets that can be used for learning causality.

Please cite our survey if this data index helps your research.

@article{guo2018survey,
  title={A Survey of Learning Causality with Data: Problems and Methods},
  author={Guo, Ruocheng and Cheng, Lu and Li, Jundong and Hahn, P. Richard and Liu, Huan}, 
  journal={arXiv preprint arXiv:1809.09337}, 
  year={2018}
}

Updates coming soon

Datasets for Learning Causal Effects (Causal Inference)

Causal Effect Estimation with Single Cause

Datasets with i.i.d. samples

Standard datasets for learning causal effects come with each instance in the format (x, d, y).

IHDP1

How IHDP1 (setting A) is simulated

IHDP2

Twins

Job Training (LaLonde 1986, in the R package qte)

ACIC Benchmark

News

TCGA
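
Each instance in the datasets above is a triple (x, d, y). As a minimal sketch of how such data is typically consumed, the example below assumes that x is a covariate, d is a binary treatment, and y is the observed outcome. The data is synthetic and the helper names (`make_synthetic`, `difference_in_means`) are hypothetical; none of this comes from the datasets listed here. The difference-in-means estimator shown is only appropriate when treatment is randomized, as it is in this simulation. Observational data generally requires adjusting for x.

```python
import random


def make_synthetic(n=5000, true_effect=2.0, seed=0):
    """Generate hypothetical (x, d, y) instances with a randomized treatment."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)             # a single covariate
        d = 1 if rng.random() < 0.5 else 0  # treatment assigned by coin flip
        # Outcome depends on the covariate, the treatment, and noise.
        y = 1.0 + 0.5 * x + true_effect * d + rng.gauss(0.0, 1.0)
        data.append((x, d, y))
    return data


def difference_in_means(data):
    """Estimate the average treatment effect as mean(y | d=1) - mean(y | d=0)."""
    treated = [y for _, d, y in data if d == 1]
    control = [y for _, d, y in data if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


data = make_synthetic()
ate_hat = difference_in_means(data)
print(f"estimated ATE: {ate_hat:.2f} (simulated true effect: 2.0)")
```

Because the treatment here is independent of x by construction, the estimate should land close to the simulated effect. With real benchmark data, the same (x, d, y) layout would feed an estimator that also conditions on x.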

Datasets with non-i.i.d. samples (with interference, spillover effect or auxiliary network information)

Amazon

Datasets with Instrumental Variables (IV)

Standard datasets for learning causal effects with instrumental variables, where each instance has the format (i, x, d, y).

1980 Census Extract

CPS Extract
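
To illustrate how the extra element in (i, x, d, y) is used, here is a minimal sketch on synthetic data. It assumes that i is a binary instrument, x a covariate, d a continuous treatment, and y the outcome, and that the treatment effect is constant. The simulation includes an unobserved confounder `u`, so a naive regression of y on d is biased, while the ratio-of-covariances IV estimator (equivalent to the Wald estimator for a binary instrument) is not. All names and parameter values are hypothetical and unrelated to the datasets listed above.

```python
import random


def make_iv_data(n=20000, true_effect=1.5, seed=0):
    """Generate hypothetical (i, x, d, y) instances with a hidden confounder."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)             # unobserved confounder
        x = rng.gauss(0.0, 1.0)             # observed covariate
        i = 1 if rng.random() < 0.5 else 0  # instrument, independent of u
        d = 0.8 * i + u + rng.gauss(0.0, 1.0)
        y = true_effect * d + 2.0 * u + 0.5 * x + rng.gauss(0.0, 1.0)
        data.append((i, x, d, y))
    return data


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def iv_estimate(data):
    """IV estimate of the effect of d on y: cov(i, y) / cov(i, d)."""
    i = [row[0] for row in data]
    d = [row[2] for row in data]
    y = [row[3] for row in data]
    return covariance(i, y) / covariance(i, d)


def naive_estimate(data):
    """Slope from regressing y on d alone, which ignores the confounder."""
    d = [row[2] for row in data]
    y = [row[3] for row in data]
    return covariance(d, y) / covariance(d, d)


iv_data = make_iv_data()
iv_hat = iv_estimate(iv_data)
naive_hat = naive_estimate(iv_data)
print(f"IV estimate: {iv_hat:.2f}, naive estimate: {naive_hat:.2f}")
```

The instrument affects y only through d in this simulation, which is the exclusion restriction that makes the ratio valid. Whether that holds for a real dataset has to be argued case by case.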

Datasets for Regression Discontinuity Design

Population Threshold RDD Datasets

Datasets with Multiple Causes

Datasets for Learning Causal Relationships (Causal Discovery)

Distinguishing Cause from Effect

Database with cause-effect pairs (Tübingen Cause-Effect Pairs)

AntiCD3/CD28

Pittsburgh Bridges

Abalone

Causal Bayesian Network

Lung Cancer Simple Set (LUCAS)

Datasets for Connections to Machine Learning

Datasets with randomized test set for recommendation systems

| Name | Paper | URL |
| --- | --- | --- |
| Coat | Schnabel, Tobias, Adith Swaminathan, Ashudeep Singh, Navin Chandak, and Thorsten Joachims. "Recommendations as treatments: Debiasing learning and evaluation." arXiv preprint arXiv:1602.05352 (2016). | download |
| Yahoo! R3 | Schnabel, Tobias, Adith Swaminathan, Ashudeep Singh, Navin Chandak, and Thorsten Joachims. "Recommendations as treatments: Debiasing learning and evaluation." arXiv preprint arXiv:1602.05352 (2016). | download |
| Spotify Music Streaming Sessions | Brost, Brian, Rishabh Mehrotra, and Tristan Jehan. "The Music Streaming Sessions Dataset." In The World Wide Web Conference, pp. 2594-2600. ACM, 2019. | download |