Source code for "Incorporating Proportional Sparse Penalty for Causal Structure Learning" and a reimplementation of "A Graph Autoencoder Approach to Causal Structure Learning" (https://arxiv.org/abs/1911.07420).
CUDA Version 9.0.176
CUDNN Version 7.6.5
Python 3.6.13
loguru==0.5.3
matplotlib==3.3.4
networkx==2.5.1
numpy
scipy
pandas==1.1.5
Pillow==8.2.0
PyYAML==5.4.1
tensorboard==2.5.0
torch==1.9.0
yapf==0.31.0
- Run GAE-PSP
Run with
python main.py --n=3000 --d=100 --graph_type=erdos-renyi --degree=3 --sem_type=gauss --dataset_type=nonlinear_3 --x_dim=1 --hidden_size=16 --latent_dim=1 --lambda_sparsity=0.01 --layer=3 --learning_rate=1e-3 --max_iters=20 --min_iters=5 --epochs=300 --init_rho=1.0 --rho_thres=1e18 --beta=10.0 --gamma=0.25 --min_h=1e-12 --max_h=1e-7 --mse_thres=1.15 --seed_data=0 --seed_model=0 --graph_thres=0.2 --base_dir=runs --cuda=-1 --psp --log_level=INFO
- Run GAE
Run with
python main.py --n=3000 --d=100 --graph_type=erdos-renyi --degree=3 --sem_type=gauss --dataset_type=nonlinear_3 --x_dim=1 --hidden_size=16 --latent_dim=1 --lambda_sparsity=1.0 --layer=3 --learning_rate=1e-3 --max_iters=20 --min_iters=5 --epochs=300 --init_rho=1.0 --rho_thres=1e18 --beta=10.0 --gamma=0.25 --min_h=1e-12 --max_h=1e-7 --mse_thres=1.15 --seed_data=0 --seed_model=0 --graph_thres=0.2 --base_dir=runs --cuda=-1 --log_level=INFO
To run multiple experiments with varying parameters (such as the dataset and model seeds), use pool.py, which distributes the runs dynamically across a predefined list of GPUs.
First, set the parameters gpus, seed_data_range and seed_model_range in pool.py. Here is an example:
# Available GPU list
gpus: list = [0, 1, 2, 3, 4, 5, 6, 7] # up to the number of devices; add -1 to also use the CPU
# Number of tasks to run simultaneously on each GPU/CPU to fully utilize its computing power
n_task_for_each: int = 3
# Parameter arrangement: a scalar value means a single experiment, while a list of values means one experiment per value
params = {}
params["--n"] = 3000 # number of samples
params["--d"] = [10, 20, 50, 100] # experiment with different number of variables
params["--graph_type"] = "erdos-renyi"
params["--degree"] = 3
params["--sem_type"] = "gauss" # ["gauss", "mnonr"]
params["--dataset_type"] = "nonlinear_3"
params["--x_dim"] = 1
params["--hidden_size"] = 16
params["--layer"] = 3
params["--latent_dim"] = 1
params["--lambda_sparsity"] = 0.01 # L1 regularization weight (0.01, 1.0)
params["--psp"] = True # (True, False)
params["--learning_rate"] = 3e-4 # note: 3e-4 is the best
params["--max_iters"] = 20
params["--min_iters"] = 5
params["--epochs"] = 300
params["--init_rho"] = 1.0
params["--rho_thres"] = 1e18
params["--beta"] = 10.0
params["--gamma"] = 0.25
params["--min_h"] = 1e-12
params["--max_h"] = 1e-7
params["--early_stopping"] = True
params["--mse_thres"] = 1.15
params["--seed_data"] = [i for i in range(10)] # experiments on 10 datasets generated with different seeds
params["--seed_model"] = [i for i in range(10)] # experiments with 10 different model initializations
params["--graph_thres"] = 0.20 # default 0.2
params["--base_dir"] = "runs" # directory in which to store results
params["--log_level"] = "DEBUG" # controls whether intermediate outputs are printed and stored
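As a rough illustration of how a dictionary like this could expand into individual runs, the sketch below takes the Cartesian product of all list-valued entries and builds one argument list per combination. It is an assumption about the behavior described above, not the actual code in pool.py: `expand_grid` is a hypothetical helper, and treating boolean values as bare switches is inferred only from the way `--psp` appears on the command line.

```python
import itertools


def expand_grid(params):
    """Yield one argument list per combination of the list-valued entries."""
    keys = list(params)
    # Wrap scalars so every entry is a list of candidate values.
    choices = [v if isinstance(v, list) else [v] for v in params.values()]
    for combo in itertools.product(*choices):
        args = []
        for key, value in zip(keys, combo):
            if isinstance(value, bool):
                # Assumption: booleans are bare switches, like --psp.
                if value:
                    args.append(key)
            else:
                args.append(f"{key}={value}")
        yield args


# A reduced version of the dictionary above, for illustration only.
params = {
    "--d": [10, 20, 50, 100],
    "--psp": True,
    "--lambda_sparsity": 0.01,
    "--seed_data": list(range(10)),
    "--seed_model": list(range(10)),
}

runs = list(expand_grid(params))
print(len(runs))  # 4 values of d * 10 data seeds * 10 model seeds
print(" ".join(["python", "main.py"] + runs[0]))
```

With the full example configuration, only `--d`, `--seed_data` and `--seed_model` are lists, so the same product would give 4 * 10 * 10 = 400 runs.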
Then run the following command:
python pool.py
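The dynamic distribution over devices can be pictured as a pool of device slots: each device contributes `n_task_for_each` slots, and every task takes whichever slot frees up first. The following is a minimal sketch under that assumption and is not taken from pool.py. `run_all` and `fake_run` are hypothetical names, and `fake_run` only stands in for launching main.py with the chosen `--cuda` value.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def run_all(tasks, gpus, n_task_for_each, run_one):
    """Run every task on whichever device slot becomes free first."""
    slots = queue.Queue()
    for device in gpus:
        for _ in range(n_task_for_each):
            slots.put(device)  # one token per concurrent task on this device

    def worker(task):
        device = slots.get()  # blocks until some device has a free slot
        try:
            return run_one(task, device)
        finally:
            slots.put(device)  # release the slot for the next task

    max_workers = len(gpus) * n_task_for_each
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the order of `tasks`.
        return list(pool.map(worker, tasks))


def fake_run(task, device):
    # Stand-in for launching main.py with the task arguments on `device`.
    return task + [f"--cuda={device}"]


tasks = [[f"--seed_data={i}"] for i in range(6)]
results = run_all(tasks, gpus=[0, 1], n_task_for_each=2, run_one=fake_run)
for args in results:
    print(" ".join(args))
```

In a real launcher, `run_one` would more likely start a subprocess and wait for it, so that each experiment runs in its own Python process.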
To evaluate the results, use the thresholding tool: specify the results directory in tune_graph_threshold.py, then run:
cd tools
python tune_graph_threshold.py
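For intuition about what tuning a graph threshold involves, the sketch below binarizes a weighted adjacency matrix at several thresholds and compares each result with a known ground-truth graph. It is an illustrative assumption, not the logic of tune_graph_threshold.py: the function names, the toy matrices and the choice of metrics (true positive rate and false discovery rate over directed edges) are all hypothetical. Plain nested lists are used to keep the example dependency-free.

```python
def threshold_graph(weights, thres):
    """Keep edge i -> j when |weights[i][j]| exceeds thres."""
    return [[1 if abs(w) > thres else 0 for w in row] for row in weights]


def edge_metrics(pred, truth):
    """Return (true positive rate, false discovery rate) over directed edges."""
    pairs = [
        (p, t)
        for p_row, t_row in zip(pred, truth)
        for p, t in zip(p_row, t_row)
    ]
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    n_true = sum(1 for _, t in pairs if t)
    tpr = tp / n_true if n_true else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return tpr, fdr


# Toy learned weights and ground truth for a 3-node graph (made up).
weights = [
    [0.0, 0.9, 0.15],
    [0.05, 0.0, 0.6],
    [0.0, 0.3, 0.0],
]
truth = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]

for thres in (0.1, 0.2, 0.4):
    tpr, fdr = edge_metrics(threshold_graph(weights, thres), truth)
    print(f"thres={thres}: TPR={tpr:.2f}, FDR={fdr:.2f}")
```

In this toy case a higher threshold removes the spurious edges while keeping both true ones, which is the trade-off a threshold sweep is meant to expose.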