Does the BoTorch optimizer in a multi-objective setting (e.g. qEHVI, qNEHVI, or qNParEGO) support hierarchical search spaces?
ayushi-3536 opened this issue · 4 comments
Yes, Ax can handle hierarchical search spaces in multi-objective optimization. Ax does this by flattening the space during the transforms that are applied before the values reach BoTorch. This functionality is fairly new and not fully supported yet, so you may run into some difficulty, but we're happy to help you out as best we can.
Since this functionality is so new, there is no tutorial on our website yet, but I've thrown together a little snippet that should help you get started on your own:
from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties

ax_client = AxClient()
ax_client.create_experiment(
    name="moo_experiment",
    parameters=[
        # Root of the hierarchical search space: the chosen model determines
        # which of the remaining parameters are active, via `dependents`.
        {
            "name": "model",
            "type": "choice",
            "values": ["Linear", "XGBoost"],
            "dependents": {
                "Linear": ["learning_rate", "l2_reg_weight"],
                "XGBoost": ["num_boost_rounds"],
            },
        },
        # Only active when model == "Linear".
        {
            "name": "learning_rate",
            "type": "range",
            "bounds": [0.001, 0.1],
        },
        {
            "name": "l2_reg_weight",
            "type": "range",
            "bounds": [1e-05, 0.001],
        },
        # Only active when model == "XGBoost".
        {
            "name": "num_boost_rounds",
            "type": "range",
            "value_type": "int",
            "bounds": [10, 20],
        },
    ],
    objectives={
        "a": ObjectiveProperties(minimize=False),
        "b": ObjectiveProperties(minimize=False),
    },
)
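From there the usual Service API loop applies. As a rough sketch (the evaluation function and the metric values below are just placeholders, and if I recall correctly the parameterization handed back by get_next_trial only contains the parameters that are active for the sampled branch):

def train_and_eval(parameters):
    # Placeholder evaluation; `parameters` should only contain the active
    # branch, e.g. {"model": "XGBoost", "num_boost_rounds": 17}.
    return {"a": 0.0, "b": 0.0}  # replace with your real metric values

for _ in range(20):
    parameters, trial_index = ax_client.get_next_trial()
    ax_client.complete_trial(
        trial_index=trial_index, raw_data=train_and_eval(parameters)
    )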
Thanks for answering the question. Does that imply that if I flatten the search space myself, it would have no impact on performance anyway, since BoTorch would end up doing the same thing?
For example, I have a config space like this:
num_layer : range(lower=1, upper=4)      # number of layers
fc_layer_1 : range(lower=16, upper=256)  # layer 1 size
fc_layer_2 : range(lower=16, upper=256)  # layer 2 size
fc_layer_3 : range(lower=16, upper=256)  # layer 3 size
fc_layer_4 : range(lower=16, upper=256)  # layer 4 size
To the optimizer I will pass all of the parameters (num_layer, fc_layer_1, fc_layer_2, fc_layer_3, fc_layer_4), but when building and evaluating a model I will use the conditional parameter (num_layer) to construct the correct model, roughly like the sketch below.
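Something like this minimal sketch (the model-building and training code is elided; the returned values are just stand-ins for my two objectives):

def evaluate(parameters):
    num_layer = parameters["num_layer"]
    # Only the first `num_layer` sizes are used to build the network; the
    # remaining fc_layer_* values sampled by the optimizer are simply ignored.
    layer_sizes = [parameters[f"fc_layer_{i}"] for i in range(1, num_layer + 1)]
    # ... build and train a model with these hidden-layer sizes here ...
    return {"a": float(sum(layer_sizes)), "b": float(num_layer)}  # stand-in objectives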
Would this strategy differ in any way if I used a hierarchical search space?
Currently that would be equivalent to what we're doing internally. We have in the past explored other approaches (such as https://arxiv.org/abs/2006.11771), but those were somewhat finicky to get working reliably across problems, so we have not integrated them into Ax yet. That said, we are actively working on internal applications that involve large hierarchical search spaces, so we may integrate other approaches that take the hierarchical structure into account in the model in the future.
I'll put this on our wishlist as "support for MOO for hierarchical search spaces without flattening the search space".