- Cornell-MOE is built on MOE, which was open sourced by Yelp.
- We extend batch expected improvement (q-EI) to the setting where derivative information is available (d-EI; Wu et al., 2017).
- We implement the batch knowledge gradient with (d-KG; Wu et al., 2017) and without (q-KG; Wu and Frazier, 2016) derivative information.
- We implement a Bayesian treatment of the hyperparameters in GP regression, which makes our batch Bayesian optimization algorithms more robust (a conceptual sketch of this step appears after this list).
- We provide several examples of optimizing synthetic and real-world functions using q-KG and d-KG in the folder 'examples'. More examples are coming.
- The project is under active development. We are revising comments in the code, and an update will be ready soon. Bug reports and issues are welcome!
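As a rough illustration of the Bayesian treatment of GP hyperparameters mentioned above, the following self-contained numpy sketch runs random-walk Metropolis over the log-lengthscale of an RBF kernel, scored by the GP marginal likelihood; acquisition values are then averaged over such hyperparameter samples. This is a conceptual sketch only, not Cornell-MOE's implementation, and every name in it is a placeholder.

import numpy as np

def rbf_kernel(X1, X2, lengthscale):
    # Unit-variance RBF (squared exponential) kernel.
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale ** 2)

def log_marginal_likelihood(X, y, lengthscale, noise_var=1e-3):
    # Standard GP marginal log-likelihood via a Cholesky factorization.
    K = rbf_kernel(X, X, lengthscale) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y.dot(alpha) - np.log(np.diag(L)).sum() - 0.5 * len(X) * np.log(2 * np.pi)

def sample_lengthscale(X, y, n_samples=500, step=0.3, seed=0):
    # Random-walk Metropolis on the log-lengthscale with a flat prior.
    rng = np.random.RandomState(seed)
    log_ls = 0.0
    current = log_marginal_likelihood(X, y, np.exp(log_ls))
    samples = []
    for _ in range(n_samples):
        proposal = log_ls + step * rng.standard_normal()
        proposal_ll = log_marginal_likelihood(X, y, np.exp(proposal))
        if np.log(rng.uniform()) < proposal_ll - current:
            log_ls, current = proposal, proposal_ll
        samples.append(np.exp(log_ls))
    return np.array(samples)

# Toy 1-D data; in batch Bayesian optimization the acquisition function would be
# averaged over these posterior hyperparameter samples.
rng = np.random.RandomState(1)
X = rng.uniform(-2.0, 2.0, size=(15, 1))
y = np.sin(3.0 * X[:, 0]) + 0.05 * rng.standard_normal(15)
print(sample_lengthscale(X, y).mean())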
Below we show two demos:
The left-hand side shows the fitted statistical model and the points suggested by Cornell-MOE; note that the function evaluations are subject to noise. The right-hand side visualizes the acquisition function according to the q-KG criterion.
Cornell-MOE implements a library of batch Bayesian optimization algorithms. It works by iteratively:
- Fitting a Gaussian Process (GP) with historical data
- Sampling the hyperparameters of the Gaussian Process via MCMC
- Finding the set of points to sample next with the highest expected gain, using batch expected improvement (q-EI), batch knowledge gradient (q-KG), derivative-enabled knowledge gradient (d-KG), or continuous-fidelity knowledge gradient (cf-KG); a conceptual sketch of the q-EI criterion appears after this list
- Returning the points to sample
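As a concrete (if simplified) illustration of the acquisition step above, the snippet below estimates q-EI for a candidate batch by Monte Carlo, given the GP posterior mean and covariance at that batch. Cornell-MOE has its own optimized implementation of these criteria; this sketch is conceptual only and the names in it are placeholders.

import numpy as np

def monte_carlo_q_ei(mean, cov, best_so_far, n_samples=100000, seed=0):
    # Monte Carlo estimate of batch expected improvement (q-EI) for minimization.
    # mean: (q,) posterior mean at the q candidate points
    # cov:  (q, q) posterior covariance at those points
    # best_so_far: lowest objective value observed so far
    rng = np.random.RandomState(seed)
    samples = rng.multivariate_normal(mean, cov, size=n_samples)  # (n_samples, q)
    # Improvement of the best point in the batch over the incumbent, clipped at 0.
    improvement = np.maximum(best_so_far - samples.min(axis=1), 0.0)
    return improvement.mean()

# Toy batch of q = 2 correlated candidate points.
mean = np.array([0.2, 0.1])
cov = np.array([[0.30, 0.15],
                [0.15, 0.25]])
print(monte_carlo_q_ei(mean, cov, best_so_far=0.3))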
Externally, you can use Cornell-MOE through the Python interface. Please refer to the examples in main.py in the folder 'examples'.
We recommend installing from source (please see the Install Documentation for details). We have tested the package on both Ubuntu and CentOS. Below we provide step-by-step instructions for installing Cornell-MOE on an AWS EC2 instance running Ubuntu.
step 1, install the requirements: python 2.6.7+, gcc 4.7.3+, cmake 2.8.9+, boost 1.51+, pip 1.2.1+, doxygen 1.8.5+
$ sudo apt-get update
$ sudo apt-get install python python-dev gcc cmake libboost-all-dev python-pip doxygen libblas-dev liblapack-dev gfortran git python-numpy python-scipy
step 2, create a virtualenv:
$ pip install virtualenv
$ virtualenv --no-site-packages ENV_NAME
step 3, set the correct environment variables for compiling the C++ code. Create a script with the following content, then source it:
export MOE_CC_PATH=/path/to/your/gcc && export MOE_CXX_PATH=/path/to/your/g++
export MOE_CMAKE_OPTS="-D MOE_PYTHON_INCLUDE_DIR=/path/to/where/Python.h/is/found -D MOE_PYTHON_LIBRARY=/path/to/python/shared/library/object"
For example, the script that we use on an AWS EC2 instance with Ubuntu is as follows
#!/bin/bash
export MOE_CC_PATH=/usr/bin/gcc
export MOE_CXX_PATH=/usr/bin/g++
export MOE_CMAKE_OPTS="-D MOE_PYTHON_INCLUDE_DIR=/usr/include/python2.7 -D MOE_PYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0"
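If you are unsure which paths to use for MOE_PYTHON_INCLUDE_DIR and MOE_PYTHON_LIBRARY, the commands below (a convenience we suggest, not part of Cornell-MOE) print the header directory and the library directory reported by your Python interpreter; the shared library (libpython*.so) usually lives under or near the latter:

$ python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())"
$ python -c "from distutils.sysconfig import get_config_var; print(get_config_var('LIBDIR'))"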
step 4, activate the virtualenv and install Cornell-MOE from source:
$ source ENV_NAME/bin/activate
$ git clone https://github.com/wujian16/Cornell-MOE.git
$ cd Cornell-MOE
$ pip install -r requirements.txt
$ python setup.py install
See the examples in the folder 'examples'. One can run main.py following the instructions there. The black-box functions that we would like to optimize are defined in synthetic_functions.py and real_functions.py. One can also define one's own functions there; a sketch of what such a definition might look like is given below.
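For illustration only, a user-defined objective might mirror the attribute layout of the KISSGP class quoted further below. The exact interface, in particular the name and signature of the evaluation method, should be copied from the existing classes in synthetic_functions.py or real_functions.py; everything in this sketch is a placeholder.

import numpy

class MyBlackBox(object):
    # Hypothetical user-defined objective; attribute names follow the KISSGP
    # example quoted below, but verify them against synthetic_functions.py.
    def __init__(self):
        self._dim = 2                     # dimension of the search space
        self._search_domain = numpy.array([[-5.0, 5.0],
                                           [-5.0, 5.0]])  # box constraints per dimension
        self._num_init_pts = 3            # number of initial design points
        self._sample_var = 0.0            # observation noise variance (assumed)
        self._min_value = 0.0             # known minimum of the objective, if available
        self._observations = []           # [] means no derivative observations
        self._num_fidelity = 0            # no fidelity parameters

    # Placeholder evaluation method: returns the objective value at point x.
    # Use the actual method name and signature from the existing example classes.
    def evaluate_true(self, x):
        return float(numpy.sum(numpy.asarray(x) ** 2))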
See Wu and Frazier, 2016. We define four synthetic functions: Branin, Rosenbrock, Hartmann3 and Hartmann6, and one real-world function: CIFAR10 (tuning a convolutional neural network on CIFAR-10). One can run main.py with the following command and the appropriate options.
# python main.py [obj_func_name] [num_to_sample] [job_id]
# q = num_to_sample
$ python main.py Hartmann3 4 1
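For reference, the Branin benchmark used above has the standard form below; the version in synthetic_functions.py may differ in scaling or sign conventions, so treat this as a reminder of the benchmark rather than the exact implementation:

import numpy as np

def branin(x1, x2):
    # Standard Branin-Hoo function on x1 in [-5, 10], x2 in [0, 15];
    # its global minimum value is about 0.397887.
    a = 1.0
    b = 5.1 / (4.0 * np.pi ** 2)
    c = 5.0 / np.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * np.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * np.cos(x1) + s

print(branin(np.pi, 2.275))  # one of the three global minima, ~0.397887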
See Wu et al., 2017. We provide a large-scale kernel learning example: the KISSGP class defined in real_functions.py. Note the line self._observations = numpy.arange(self._dim) in the class definition:
class KISSGP(object):
    def __init__(self):
        self._dim = 3
        self._search_domain = numpy.array([[-1, 3], [-1, 3], [-1, 3]])
        self._num_init_pts = 1
        self._sample_var = 0.0
        self._min_value = 0.0
        # indices of the partial derivatives observed along with function values
        self._observations = numpy.arange(self._dim)
        self._num_fidelity = 0
which means that we observe the first 3 partial derivatives (numpy.arange(3) yields the indices 0, 1, 2). One can run this benchmark similarly by
$ python main.py KISSGP 4 1
If one modifies this to self._observations = [] and then reruns the command above, it will execute the q-KG algorithm without exploiting gradient observations. The comparison between q-KG and d-KG over 10 independent runs is as follows:
coming soon
If you find the code useful, please kindly cite our papers Wu and Frazier, 2016 and Wu et al., 2017.
@inproceedings{wu2016parallel,
title={The Parallel Knowledge Gradient Method for Batch Bayesian Optimization},
author={Wu, Jian and Frazier, Peter},
booktitle={Advances in Neural Information Processing Systems},
pages={3126--3134},
year={2016}
}
@inproceedings{wu2017bayesian,
title={Bayesian Optimization with Gradients},
author={Wu, Jian and Poloczek, Matthias and Wilson, Andrew Gordon and Frazier, Peter I},
booktitle={Advances in Neural Information Processing Systems},
note={Accepted for publication},
year={2017}
}
See Contributing Documentation
Cornell-MOE is licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0