facebook/Ax

Multi-fidelity optimization with KG and service API

soerenjalas opened this issue · 7 comments

Hi,
First off, thanks for the great work! I've been trying to run multi-fidelity optimization using the one-shot KG method from botorch with the service API. In principle, everything seems to run fine. However, I noticed that the optimizer keeps running at the lowest fidelity and does not explore this parameter.
To check whether this is an issue with my problem, I tried to reproduce the MFKG example from the botorch documentation, where the fidelity clearly changes during the optimization, but I observe the same behaviour.
This is the code I ran to reproduce the botorch example.

from ax.service.ax_client import AxClient
from botorch.test_functions.multi_fidelity import AugmentedHartmann
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
import torch

problem = AugmentedHartmann(negate=True)
def objective(parameters):
    # x7 is the fidelity
    x = torch.tensor([parameters.get(f"x{i+1}") for i in range(7)])
    return {"f": (problem(x), 0.0)}

gs = GenerationStrategy(
    steps=[
        GenerationStep(model=Models.SOBOL, num_trials=16),
        GenerationStep(model=Models.GPKG, num_trials=-1),
    ]
)


ax_client = AxClient(generation_strategy=gs)
ax_client.create_experiment(
    name="hartmann_mf_experiment",
    parameters=[
        {
            "name": "x1",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x2",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x3",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x4",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x5",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x6",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x7",
            "type": "range",
            "bounds": [0.0, 1.0],
            "is_fidelity": True,
            "target_value": 1.
        },
    ],
    objective_name="f",
)
# Initial Sobol samples
for i in range(16):
    parameters, trial_index = ax_client.get_next_trial()
    ax_client.complete_trial(trial_index=trial_index, raw_data=objective(parameters))

# KG-based Bayesian optimization
for i in range(6):
    q_p, q_t = [], []
    # Simulate batches
    for q in range(4):
        parameters, trial_index = ax_client.get_next_trial()
        q_p.append(parameters)
        q_t.append(trial_index)
    for q in range(4):
        pi = q_p[q]
        ti = q_t[q]
        ax_client.complete_trial(trial_index=ti, raw_data=objective(pi))

[Figure: sampled fidelity x7 per trial]

After the initial samples, the fidelity stays at 0 except for two trials where it is very close to zero.

Are there any parameters that need to be specified for the acquisition function, or is MFKG not yet supported with the service API?

Thanks!

Hi @soerenjalas! Thanks for the great repro example; let me look into this.

Hi,
to follow up on this: I think I managed to get it behaving more like I would expect. The behaviour seems to be related to the cost_intercept value of the KG object:

GenerationStep(
    model=Models.GPKG,
    num_trials=-1,
    model_kwargs={"cost_intercept": 5},
    model_gen_kwargs={"num_fantasies": 128},
)

After setting these parameters to match those in the botorch example mentioned above, the fidelity is explored much better.
[Figure: sampled fidelity x7 per trial with cost_intercept=5]

Hi @soerenjalas, that's a great question. What's happening here is that the multi-fidelity KG acquisition function optimizes an objective of the form information_gain / cost. When the cost model is improperly specified (particularly if the cost intercept is too low), arms with cost near 0 can be heavily favored. It seems that in this case the default cost specification was incompatible with the objective model, which caused the strange behavior. Raising the fixed cost to 5.0 narrowed the relative range of possible costs across the design space, which led the algorithm to explore in a more intuitive manner.
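To make the effect concrete, here is a quick back-of-the-envelope check (a sketch that assumes the default linear cost with a unit fidelity weight, i.e. cost(x7) = cost_intercept + x7):

# Cost ratio between the cheapest (x7 = 0) and most expensive (x7 = 1)
# fidelity, assuming cost(x7) = intercept + x7.
for intercept in (0.01, 5.0):
    lo, hi = intercept + 0.0, intercept + 1.0
    print(f"intercept={intercept}: hi/lo cost ratio = {hi / lo:.1f}")
# intercept=0.01 -> ratio ~101: low fidelities look nearly free, so
#                   information_gain / cost is maximized at x7 ~ 0
# intercept=5.0  -> ratio 1.2: the fidelity choice is driven mostly by
#                   information gain rather than cost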

For your application, do you have a cost function in mind? A more custom cost can be used by specifying a different cost_aware_utility (see https://github.com/pytorch/botorch/blob/master/botorch/acquisition/cost_aware.py). This is used here (https://github.com/facebook/Ax/blob/master/ax/models/torch/botorch_kg.py#L323-L326) in Ax, but by default, the Ax model uses a linear cost in the denominator. I'd suggest first trying out different cost_intercepts and picking one that is appropriate for your application; if that doesn't work well, you might consider using a custom cost-aware utility.
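For reference, the default construction in the Ax lines linked above looks roughly like the following sketch (using botorch's AffineFidelityCostModel and InverseCostWeightedUtility; the fidelity index 6 refers to x7 in the experiment above, and fixed_cost is what Ax exposes as cost_intercept):

from botorch.models.cost import AffineFidelityCostModel
from botorch.acquisition.cost_aware import InverseCostWeightedUtility

# Linear cost in the fidelity parameter: cost(x) = fixed_cost + x7
cost_model = AffineFidelityCostModel(fidelity_weights={6: 1.0}, fixed_cost=5.0)
# KG divides the expected information gain by this cost
cost_aware_utility = InverseCostWeightedUtility(cost_model=cost_model)

A custom cost would implement the CostAwareUtility interface from the cost_aware module and be passed in place of InverseCostWeightedUtility.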

@lena-kashtelyan Any thoughts on a multi-fidelity Ax tutorial? (e.g. based on what's in this thread). Not sure if it's worth it compared to other priorities, but figured I'd float the idea. Not a huge deal, just a drive-by comment.

Based on the repro above, I get:

Raw data must be data for a single arm for non batched trials.
  File "[C:\Users\sterg\miniconda3\envs\packing\Lib\site-packages\ax\utils\common\typeutils.py]()", line 84, in checked_cast_complex
    check_type("val", val, typ)
  File "[C:\Users\sterg\miniconda3\envs\packing\Lib\site-packages\typeguard\__init__.py]()", line 757, in check_type
    checker_func(argname, value, expected_type, memo)
  File "[C:\Users\sterg\miniconda3\envs\packing\Lib\site-packages\typeguard\__init__.py]()", line 558, in check_union
    raise TypeError('type of {} must be one of ({}); got {} instead'.

During handling of the above exception, another exception occurred:

  File "[C:\Users\sterg\miniconda3\envs\packing\Lib\site-packages\ax\utils\common\typeutils.py]()", line 87, in checked_cast_complex
    raise ValueError(message or f"Value was not of type {typ}: {val}")
  File "[C:\Users\sterg\miniconda3\envs\packing\Lib\site-packages\ax\service\ax_client.py]()", line 1401, in _raw_data_by_arm
    arm_name: checked_cast_complex(
  File "[C:\Users\sterg\miniconda3\envs\packing\Lib\site-packages\ax\service\ax_client.py]()", line 1438, in _make_evaluations_and_data
    raw_data_by_arm = self._raw_data_by_arm(trial=trial, raw_data=raw_data)
  File "[C:\Users\sterg\miniconda3\envs\packing\Lib\site-packages\ax\service\ax_client.py]()", line 1215, in _update_trial_with_raw_data
    evaluations, data = self._make_evaluations_and_data(
  File "[C:\Users\sterg\miniconda3\envs\packing\Lib\site-packages\ax\service\ax_client.py]()", line 598, in complete_trial
    data_update_repr = self._update_trial_with_raw_data(
  File "[C:\Users\sterg\Documents\GitHub\sparks-baird\bayes-opt-particle-packing\examples\multi_fidelity_example.py]()", line 47, in <module>
    ax_client.complete_trial(trial_index=trial_index, raw_data=objective(parameters))

Easy fix. Change:

return {"f": (problem(x), 0.0)}

to:

return {"f": (problem(x).item(), 0.0)}

because problem(x) returns a 0d tensor rather than a Python scalar otherwise.
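Putting the fix into the repro above, the corrected objective reads:

def objective(parameters):
    # x7 is the fidelity parameter
    x = torch.tensor([parameters.get(f"x{i+1}") for i in range(7)])
    # .item() converts the 0d tensor to a plain float, which the
    # service API expects as the mean in a (mean, SEM) tuple
    return {"f": (problem(x).item(), 0.0)}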