These instructions have been tested on Linux (last updated: 17th June 2022).
Originally accepted at AAAI-21. The tags [cwidth] and [expert] refer to the two follow-up versions accepted at NeurIPS-21 and UAI-22, respectively. See the Results section for publication details.
- Java SDK (required to run BLIP)
- Maven (required to build BLIP)
- Python 3.6 or higher (required to run BN-SLIM)
- C++ (required to run Merlin) [cwidth]
- Make (required to build Merlin and UWrMaxSat)
- git (required to obtain ETL source code)
- Python 2.7 (required to run ETL)
Note: Before you start, move all Python (`.py`) files into the `slim` directory (see Section: Directory Structure).
- Download and unzip source from MaxSAT Evaluation 2019
  ```
  pushd solvers
  wget https://maxsat-evaluations.github.io/2019/mse19-solver-src/complete/UWrMaxSat-1.0.zip
  unzip UWrMaxSat-1.0.zip
  ```
- Apply patch `uwrmaxsat.patch` to fix a minor bug
  ```
  pushd UWrMaxSat-1.0/code
  patch -ub uwrmaxsat/MsSolver.cc -i ../../uwrmaxsat.patch
  ```
- Build UWrMaxSat
  ```
  chmod 777 starexec_build
  ./starexec_build
  popd
  ```
- Place the built binary in the `solvers` directory
  ```
  cp UWrMaxSat-1.0/bin/uwrmaxsat . -v
  popd
  ```
- (Optional) Test
  ```
  solvers/uwrmaxsat -m test/test.cnf -v1 -cpu-lim=2
  ```
- Download and unzip source from BLIP
  ```
  pushd solvers
  wget https://ipg.idsia.ch/upfiles/blip-publish.zip --no-check-certificate
  unzip blip-publish.zip
  ```
- Apply patch `blip.patch`
  ```
  pushd blip-publish
  patch -ub -p1 -i ../blip.patch
  ```
- Compile BLIP [cwidth] and move the jar to `solvers/blip.jar`
  ```
  ./compile-blip.sh
  cp blip.jar ../blip.jar -v
  ```
- Undo patch `blip.patch` and apply `blip-con.patch`
  ```
  patch -u -p1 -i ../blip.patch --reverse
  patch -ub -p1 -i ../blip-con.patch
  ```
- Compile BLIP [expert] and move the jar to `solvers/blip-con.jar`
  ```
  ./compile-blip.sh
  cp blip.jar ../blip-con.jar -v
  popd
  popd
  ```
- (Optional) Tests
  - [cwidth] Check if the output of `java -jar solvers/blip.jar solver.kg.adv` contains `-cw`
  - [expert] Check if the output of `java -jar solvers/blip.jar solver.kg.adv` contains `-con`
  - Check if `tmp.res` contains the `elim-order` field after running
    ```
    java -jar solvers/blip.jar solver.kmax -j test/test.jkl -r tmp.res -w 5 -v 1
    ```
- Download and unzip source from Merlin
  ```
  pushd solvers
  wget https://github.com/radum2275/merlin/tree/008d910307a2043f9446676f28010079ea49ec15 --no-check-certificate -O merlin-src.zip
  unzip merlin-src.zip
  ```
- Apply patch `merlin.patch` to enable elim-order output
  ```
  pushd merlin-src
  patch -ub -p1 -i ../merlin.patch
  ```
- Compile Merlin
  ```
  make
  popd
  ```
- Place the executable in the `solvers` directory
  ```
  cp merlin-src/bin/merlin . -v
  popd
  ```
- (Optional) Test and check if `tmp.PR.json` contains a `value` of `-7.730569` after running
  ```
  solvers/merlin -t PR -a wmb -M lex -o tmp -O json -i 4 -f test/test.uai -e test/test.evid
  ```
- Download and unzip the source from et-learn, or clone the git repo
  ```
  pushd solvers
  git clone https://github.com/marcobb8/et-learn.git et_learn
  ```
- Follow the instructions for installing et-learn from its readme
  Note: et-learn requires Python 2.7 while BN-SLIM requires Python 3.6+, and the two are not mutually compatible. It is therefore recommended to use virtual environments and keep the two installations separate.
- Move the driver file `hcet.py` to the same directory
  ```
  mv ../hcet.py et_learn
  ```
- (Optional) Test and check if `tmp.res` is generated after running
  ```
  cd solvers/et_learn
  python hcet.py ../../test/test.data 5 -p -o tmp
  ```
The datasets are available in the `datasets` folder, which contains the subfolders `dat`, `jkl` and `con` (see the Section on File Formats for more details).
The experimental results are contained in the `experiments` folder. In addition to some plots, the raw data from the experiments are provided as `.csv` files.
To install requirements:
```
pip install -r requirements.txt
```
To install optional requirements that allow drawing graphs etc.:
```
apt install graphviz
pip install -r optional_requirements.txt
```
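For instance, once the optional requirements are installed, a small DAG can be rendered with the `graphviz` Python bindings. This is a minimal sketch: it assumes the optional requirements include the `graphviz` package, and the parent-set dictionary is a made-up example:
```python
# Minimal sketch: render a DAG given as {child: list of parents} with
# the graphviz Python package. The network below is made up purely
# for illustration.
from graphviz import Digraph

parents = {"A": [], "B": ["A"], "C": ["A", "B"]}

dot = Digraph()
for child, pars in parents.items():
    dot.node(child)
    for p in pars:
        dot.edge(p, child)  # arc from parent to child

dot.render("example-dag", format="png", cleanup=True)
```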
The directory structure should now be as follows (`.zip` files hidden for brevity, `*` denotes generated files):
```
bnslim
├── README.md
├── setup-uwrmaxsat.sh
├── setup-blip.sh
├── setup-merlin.sh
├── download-etlearn.sh
├── test
│   ├── test.cnf
│   ├── test.jkl
│   ├── test.dat
│   ├── test.uai
│   ├── test.evid
│   ├── test.data
│   ├── demo.res
│   └── demo.net
├── requirements.txt
├── optional_requirements.txt
├── solvers
│   ├── uwrmaxsat.patch
│   ├── blip.patch
│   ├── merlin.patch
│   ├── UWrMaxSat-1.0*
│   ├── uwrmaxsat*
│   ├── blip-publish*
│   ├── blip.jar*
│   ├── merlin-src*
│   ├── merlin*
│   └── et_learn*
│       ├── hcet.py
│       └── ...
├── datasets
│   ├── dat/
│   ├── jkl/
│   └── con/
├── experiments
│   ├── cwidth/
│   └── expert/
└── slim
    ├── blip.py
    ├── bnio.py
    ├── constrained_encoding.py
    ├── complexity_encoding.py
    ├── berg_encoding.py
    ├── eval_model.py
    ├── generate_constraints.py
    ├── samer_veith.py
    ├── slim.py
    ├── utils.py
    └── verify_cwidth.py
```
```
cd slim

# bounded treewidth
python slim.py ../test/test.jkl 5 -v -u kmax -t 60

# bounded msss [cwidth]
python slim.py ../test/test.jkl 0 -v -u kmax -t 60 -b 0 -d ../test/test.dat --start-with ../test/demo.res -w --feasible-cw --feasible-cw-threshold 108

# constraints + bounded treewidth [expert]
# - first generate starting network through blip-con.jar
java -jar ../solvers/blip-con.jar solver.kg.adv -v 1 -con ../test/test.con -j ../test/test.jkl -w 5 -r ../test/demo2.res -t 60 -seed 1
# - then run slim on it
python slim.py ../test/test.jkl 5 -v -u kg.adv -t 60 -b 10 -s 5 --constraint-file ../test/test.con --start-with ../test/demo2.res -r 1
```
Run `python slim.py --help` for a description of the available options.
For larger networks it is recommended to invoke it as `python -O slim.py ...` (the `-O` flag disables Python assertions).
Since most of the file formats used by this software and its auxiliary tools aren't standard, we provide a brief description of some of these file formats below.
A `.res` file contains the DAG expressed by means of its parent sets, as well as a field indicating the score of the DAG and, optionally, the elimination ordering used to bound the treewidth of the DAG. See `test/demo.res` for an example.
The patch applied to BLIP allows it to output `.res` files with the elimination ordering, which it doesn't support out-of-the-box. Additionally, the patch adds support for working with maximum state space size instead of treewidth.
[cwidth] The patch applied to Merlin allows us to customize the variable elimination ordering heuristic via the `-M` command-line argument. In our use case, we pre-set the ordering in the `.uai` file and supply `-M lex` to invoke the lexicographic ordering, thereby preserving our pre-supplied ordering.
These files contain the samples drawn from the Bayesian networks. This isn't a standard format, and hence different programs and applications treat these files differently. In general, they are space-separated (or comma-separated) files, with or without a header. The header consists of two lines at the beginning of the file: the first contains the (space-separated) names of the random variables, and the second contains the domain sizes of the variables in the same order.
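As an illustration, here is a minimal sketch for reading such a sample file with a two-line header (the delimiter detection and header handling are simplifying assumptions; adjust them to the files at hand):
```python
# Minimal sketch: read a sample file with the two-line header described
# above (variable names, then domain sizes). Assumes a header is present
# and guesses the delimiter; real files may differ.
def read_samples(path):
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    sep = "," if "," in lines[0] else None  # None splits on whitespace
    names = lines[0].split(sep)
    domains = [int(x) for x in lines[1].split(sep)]
    # keep sample values as strings, since state encodings vary
    samples = [line.split(sep) for line in lines[2:]]
    return names, domains, samples
```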
A `.jkl` file contains the pre-computed parent set score function caches. Usually these are computed via `blip.jar scorer.is ...`.
These are somewhat standard file formats for storing Bayesian networks. The `.uai` format is required by Merlin, and the `.net` format is used as an intermediate format to go from `.res` to `.uai`.
File format for supplying evidence to Merlin.
The provided Python script `hcet.py` allows one to obtain the solution of the ETL algorithm(s) in the form of a compliant `.res` file, which can be supplied to `slim.py` through the `--start-with` option. Run `python et_learn/hcet.py --help` for more details.
Note: Since the DAG computed by ETL could contain parent sets that are absent from the pre-computed score function caches in `datasets/jkl`, one might need to "merge" the jkl file output by `hcet.py` with the one from `datasets/jkl` to obtain a new jkl file, which can then be used as input for BN-SLIM.
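Such a merge could look roughly as follows. This is a sketch under assumptions: it assumes the common `.jkl` layout (a variable count on the first line, then per-variable blocks of `<score> <num_parents> <parents...>` lines) and that duplicate parent sets carry identical scores; verify both against your actual files:
```python
# Rough sketch of merging two .jkl parent-set score caches.
# Assumed layout (verify against your files):
#   line 1: number of variables
#   per variable: "<var> <num_parent_sets>", followed by that many
#   lines of "<score> <num_parents> <parent> <parent> ...".
def read_jkl(path):
    with open(path) as f:
        rows = [line.split() for line in f if line.strip()]
    it = iter(rows)
    n = int(next(it)[0])
    cache = {}
    for _ in range(n):
        var, count = next(it)
        entries = cache.setdefault(var, {})
        for _ in range(int(count)):
            row = next(it)
            # key by parent set so identical entries collapse on merge
            entries[frozenset(row[2:])] = float(row[0])
    return cache

def write_jkl(cache, path):
    with open(path, "w") as f:
        f.write("%d\n" % len(cache))
        for var, entries in cache.items():
            f.write("%s %d\n" % (var, len(entries)))
            for parents, score in entries.items():
                fields = [str(score), str(len(parents))] + sorted(parents)
                f.write(" ".join(fields) + "\n")

def merge_jkl(path_a, path_b, out_path):
    cache = read_jkl(path_a)
    for var, entries in read_jkl(path_b).items():
        cache.setdefault(var, {}).update(entries)
    write_jkl(cache, out_path)
```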
Non-standard file format for representing expert constraints. It contains one constraint per line in the format `<type> <from> <to>`, where `<from>` and `<to>` are the endpoints of a constraint of type `<type>`. The supported constraint types are: positive/negative/undirected arc (denoting direct dependence) and positive/negative ancestry (denoting indirect dependence).
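For illustration, here is a minimal sketch for parsing this layout. The concrete type tokens used below are hypothetical placeholders; consult the files in `datasets/con` for the tokens actually used:
```python
# Minimal sketch: parse "<type> <from> <to>" constraint lines.
# KNOWN_TYPES holds hypothetical placeholder tokens for the five
# supported constraint types; check datasets/con for the real ones.
KNOWN_TYPES = {"posarc", "negarc", "undarc", "posanc", "neganc"}

def read_constraints(path):
    constraints = []
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            ctype, frm, to = line.split()
            if ctype not in KNOWN_TYPES:
                raise ValueError("unknown constraint type: %s" % ctype)
            constraints.append((ctype, frm, to))
    return constraints
```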
Turbocharging Treewidth-Bounded Bayesian Network Structure Learning
Vaidyanathan Peruvemba Ramaswamy, Stefan Szeider
Proceedings of AAAI-21, the Thirty-Fifth AAAI Conference on Artificial Intelligence, pages 3895–3903, 2021, AAAI Press.
[bibtex] [url] [pdf]
| ∆BIC | tw 2 | tw 5 | tw 8 |
|---|---|---|---|
| extremely positive | 89 | 77 | 75 |
| strongly positive | 3 | 2 | 4 |
| positive | 0 | 4 | 3 |
| neutral | 5 | 14 | 15 |

| ∆BIC | tw 2 | tw 5 | tw 8 |
|---|---|---|---|
| extremely positive | 91 | 80 | 83 |
| strongly positive | 0 | 2 | 5 |
| positive | 0 | 1 | 1 |
| neutral | 6 | 14 | 8 |

| ∆BIC | tw 2 | tw 5 | tw 8 |
|---|---|---|---|
| extremely positive | 94 | 74 | 79 |
| strongly positive | 0 | 1 | 0 |
| positive | 0 | 1 | 1 |
| neutral | 0 | 6 | 8 |
| negative | 0 | 0 | 0 |
| strongly negative | 0 | 1 | 0 |
| extremely negative | 0 | 11 | 6 |
The above tables were obtained by running with the following configuration:
```
python -O slim.py <dataset> <treewidth> -r<seed> -t1800 -u kmax -vx --start-with=<sol.res>
```
Where `sol.res` is the initial heuristic solution which was computed beforehand and supplied using the `--start-with` option.
If that doesn't work, try running with an `aaai`-tagged commit.
Learning fast-inference Bayesian networks
Vaidyanathan Peruvemba Ramaswamy, Stefan Szeider
Proceedings of NeurIPS 2021, the Thirty-fifth Conference on Neural Information Processing Systems (M. Ranzato, A. Beygelzimer, K. Nguyen, P.S. Liang, J.W. Vaughan, Y. Dauphin, eds.), pages 17852–17863, 2021.
[bibtex] [url] [pdf]
`experiments/cwidth/baseline_data.csv` and `experiments/cwidth/heuristic_data.csv` contain all the data for the 16 datasets from `datasets.zip`, for multiple treewidths, msss bounds and random seeds, run with the following configuration:
```
python -O slim.py <dataset> <treewidth> -t5400 -u <heuristic> -d <datfile.dat> --start-with <heur_sol.res> -w --feasible-cw --feasible-cw-threshold <msssbound> -r<seed> --budget 0 -v
```
Where `heur_sol.res` is the initial heuristic solution which was computed beforehand and supplied using the `--start-with` option, and `datfile.dat` is the datfile corresponding to the supplied dataset.
Learning Large Bayesian Networks with Expert Constraints
Vaidyanathan Peruvemba Ramaswamy, Stefan Szeider
38th Conference on Uncertainty in Artificial Intelligence (UAI 2022), Eindhoven, Netherlands, August 1–5, 2022 (James Cussens, Kun Zhang, eds.), 2022.
[bibtex] [url] [pdf]
| ∆BIC | count |
|---|---|
| extremely positive | 127 |
| strongly positive | 0 |
| positive | 0 |
| neutral | 14 |
| negative | 1 |
| strongly negative | 0 |
| extremely negative | 7 |
The data from the experimental evaluation for the paper are available in the `experiments/expert` folder, as `<method>_data.csv` files for the scores and `<method>_activity.csv` files for the improvement activity.
The script `generate_constraints.py` was used to randomly select a subset of constraints (an equal number of each type) from the ground-truth networks.
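The selection step amounts to stratified sampling by constraint type. Here is a minimal sketch of that idea, not of the actual script (the function name and tuple layout are illustrative assumptions):
```python
# Minimal sketch of per-type constraint sampling (illustrative only):
# group the candidate constraints by type and draw the same number
# from each group, so every type is equally represented.
import random
from collections import defaultdict

def sample_constraints(constraints, per_type, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    by_type = defaultdict(list)
    for ctype, frm, to in constraints:
        by_type[ctype].append((ctype, frm, to))
    picked = []
    for group in by_type.values():
        # draw per_type constraints (or all, if the group is smaller)
        picked.extend(rng.sample(group, min(per_type, len(group))))
    return picked
```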