Python code to reproduce all plots in:
❝Bayesian optimization of nanoporous materials❞ A. Deshwal, C. Simon, J. R. Doppa. ChemRxiv. (2021) DOI
the Python 3 libraries required for the project are in requirements.txt. use Jupyter Notebook or Jupyter Lab to run the Python 3 code in the *.ipynb files.
our paper relies on data from Mercado et al. we visited Materials Cloud to download properties.tgz and untarred it, giving properties.csv in new/. this is the data we use.
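the untar step can also be done from Python. below is a minimal sketch using only the standard library, assuming properties.tgz has already been downloaded by hand. the function name extract_archive and the assumption that the archive sits in the working directory are illustrative and not part of the repository.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a gzipped tar archive into dest_dir; return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


# assumed location of the downloaded archive; skipped if the file is absent
if Path("properties.tgz").exists():
    print(extract_archive("properties.tgz", "new"))
```

whether properties.csv ends up directly in new/ depends on how the archive is laid out, so check the printed member names.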
run the code in the Jupyter Notebook prepare_Xy.ipynb to prepare the data and write inputs_and_outputs.pkl, to be read in by the other Notebooks. in this Notebook, you can set the number of runs (nb_runs), the number of iterations for each run (nb_iterations), and, if you wish, a flag (downsample_data) for testing.
run the following Jupyter Notebooks, which will write search results to .pkl files.
random_search.ipynb for random search.
evol_search.ipynb for evolutionary search (CMA-ES).
random_forest_run.ipynb for one-shot supervised machine learning (via random forests). run twice, once with the flag diversify_training = True, the other with diversify_training = False.
BO_run.ipynb for Bayesian optimization. run three times, with which_acquisition set to "EI", "max y_hat", and "max sigma".
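to illustrate what the three which_acquisition settings select for, here is a minimal sketch of how each could score a candidate from a surrogate model's predicted mean (y_hat) and standard deviation (sigma), assuming a maximization problem. this is not the code in BO_run.ipynb; the function names, the argument y_best (best value observed so far), and the toy numbers are assumptions made for illustration. "EI" is the standard closed-form expected improvement for a Gaussian prediction.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def acquisition(which_acquisition, y_hat, sigma, y_best):
    """Score one candidate; larger is better (hypothetical helper)."""
    if which_acquisition == "max y_hat":
        return y_hat  # pure exploitation: highest predicted value
    if which_acquisition == "max sigma":
        return sigma  # pure exploration: most uncertain prediction
    if which_acquisition == "EI":
        if sigma <= 0.0:
            return max(y_hat - y_best, 0.0)
        z = (y_hat - y_best) / sigma
        return (y_hat - y_best) * normal_cdf(z) + sigma * normal_pdf(z)
    raise ValueError("unknown acquisition: " + repr(which_acquisition))


def pick_next(which_acquisition, y_hats, sigmas, y_best):
    """Index of the candidate with the highest acquisition score."""
    scores = [acquisition(which_acquisition, m, s, y_best)
              for m, s in zip(y_hats, sigmas)]
    return max(range(len(scores)), key=scores.__getitem__)


# toy predictions for three candidates (made-up numbers)
y_hats = [1.0, 2.0, 1.5]
sigmas = [0.1, 0.1, 2.0]
for name in ("EI", "max y_hat", "max sigma"):
    print(name, pick_next(name, y_hats, sigmas, y_best=2.0))
```

in this toy case "max y_hat" picks the candidate with the highest predicted value, while "max sigma" and "EI" both pick the highly uncertain third candidate.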
each .ipynb can be run on a desktop computer. the BO code takes the longest, at ~10 min per run.
finally, run viz.ipynb to read in the *.pkl files output from the search runs and visualize the results.
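to inspect the search outputs outside of viz.ipynb, the *.pkl files can be read with the standard pickle module. a minimal sketch follows; load_results is a hypothetical helper, and the structure of the pickled objects is not documented here, so the sketch only collects them by file name.

```python
import pickle
from pathlib import Path


def load_results(directory="."):
    """Return {file stem: unpickled object} for every .pkl file in directory."""
    results = {}
    for path in sorted(Path(directory).glob("*.pkl")):
        with open(path, "rb") as f:
            results[path.stem] = pickle.load(f)
    return results


# example use, run from the folder holding the .pkl files:
#   for name, obj in load_results().items():
#       print(name, type(obj).__name__)
```

note that this also picks up inputs_and_outputs.pkl if it sits in the same folder. only unpickle files you generated yourself, since loading a pickle can execute arbitrary code.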
see synthetic_example.ipynb for the toy GP plots in the paper.