CausalDatasets

Existing Benchmarks

Repository: CANDLE on GitHub
Paper Link: CANDLE: Causal ANalysis in DisentangLed rEpresentations
Description: Image-based dataset that includes known causal generative factors as well as confounders to help study and improve deep generative latent variable models from a causal perspective. The dataset aims to enable both unsupervised and supervised learning methods to study disentangled representation learning. CANDLE includes images with controlled variations in objects, colors, sizes, rotations, lighting, and scenes. Each image's metadata is provided in JSON format, detailing the scene, lighting, object types, colors, sizes, and rotations.
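
Because each image ships with JSON metadata, the ground-truth factors can be read without touching the pixels. The following is a minimal sketch of that idea, not the CANDLE loader: the key names (`scene`, `lights`, `objects`, `object_type`, `color`, `size`, `rotation`) and the sample record are assumptions made for illustration, so check the repository for the actual schema before relying on them.

```python
import json
from collections import Counter

# Hypothetical metadata record. The key names and values are assumptions
# for illustration only, not the actual CANDLE schema.
SAMPLE_METADATA = """
{
  "scene": "indoor",
  "lights": "left",
  "objects": [
    {"object_type": "cube", "color": "red", "size": "small", "rotation": 0},
    {"object_type": "sphere", "color": "blue", "size": "large", "rotation": 15}
  ]
}
"""


def summarize_factors(metadata_text):
    """Parse one metadata record and tally its generative factors."""
    record = json.loads(metadata_text)
    objects = record["objects"]
    return {
        "scene": record["scene"],
        "lights": record["lights"],
        "object_types": Counter(obj["object_type"] for obj in objects),
        "colors": Counter(obj["color"] for obj in objects),
    }


summary = summarize_factors(SAMPLE_METADATA)
print(summary["scene"], dict(summary["object_types"]))
```

Tallies like these, aggregated over the whole dataset, are one way to check how strongly a confounder such as the scene co-occurs with a factor such as object type.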

Repository: Causal3DIdent on GitHub
Paper Link: Causal3DIdent Paper
Description: The Causal3DIdent dataset is designed for studying causal representation learning in 3D visual environments. It extends the earlier 3DIdent dataset by adding six object classes (Hare, Dragon, Cow, Armadillo, Horse, and Head) and imposing a causal graph over the latent variables, so that dependencies between the latents are known by construction.

Repository: CausalCircuit on GitHub
Paper Link: CausalCircuit Paper
Description: The dataset consists of images of a robotic arm interacting with a system of buttons and lights. The system has four causal variables: the position of the robot arm along an arc, and the intensities of the red, green, and blue lights. The images are rendered at 512x512 with MuJoCo, an open-source physics engine.

Repository: CausalChaos on GitHub
Paper Link: CausalChaos Paper
Description: A challenging causal Why-QA dataset built on the "Tom and Jerry" cartoon series. It consists of questions, answers, and explanations.

Repository: MultiModelCausalIdent on GitHub
Paper Link: MultiModelCausalIdent Paper
Description: An extension of the Causal3DIdent dataset that adds a textual description for each image.

Repository: CausalTriplet on GitHub
Paper Link: CausalTriplet Paper
Description: Each instance in the dataset comprises a pair of images captured before and after a high-level action. The dataset features high visual complexity, actionable counterfactual observations, and downstream interventional reasoning, all of which are essential for causal representation learning in real-world contexts.
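
The before/after structure described above can be pictured as a small record type. This is a minimal sketch under stated assumptions, not the CausalTriplet API: the `ActionPair` class, its field names, the nested-list image representation, and the `changed_fraction` helper are all hypothetical and chosen only to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B) values
Image = List[List[Pixel]]      # rows of pixels; a stand-in for a real image array


@dataclass
class ActionPair:
    """Hypothetical container for one instance: two frames and the action between them."""
    before: Image
    after: Image
    action: str


def changed_fraction(pair: ActionPair) -> float:
    """Return the fraction of pixel positions that differ between the two frames."""
    total = 0
    changed = 0
    for row_before, row_after in zip(pair.before, pair.after):
        for px_before, px_after in zip(row_before, row_after):
            total += 1
            if px_before != px_after:
                changed += 1
    return changed / total if total else 0.0


black, white = (0, 0, 0), (255, 255, 255)
pair = ActionPair(
    before=[[black, black], [black, black]],
    after=[[black, white], [black, black]],
    action="open",  # hypothetical action label
)
fraction = changed_fraction(pair)
print(fraction)  # 0.25: one of the four pixels changed
```

A raw pixel difference is only a crude proxy for the effect of an action. The point of the sketch is the data layout, in which the action label plays the role of an intervention linking the two observations.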