facebook/Ax

[Feature Request] Support Exact Equality Parameter Constraints

jkterry1 opened this issue · 16 comments

Right now Ax only supports <= or >= as parameter constraints. Most hacks that refactor my constraint from #153 into a linear form still require a ==, and in general I believe such constraints are common in LPs and operations-research-type problems.

It's my understanding that support for this is distinct from, and much simpler than, supporting nonlinear constraints, which is why I created a separate issue.

@justinkterry, Just to make sure I understand, what would be the difference between exact equality parameter constraint and a simple fixed parameter?

@lena-kashtelyan Most of the use cases I've seen involving == look something like param_a*param_b == constant,

Or in the linear case you could have a*param_a + b*param_b == constant. These things all come up in cases where different parameter values consume different portions of a finite asset.

This one is much less challenging to support than general nonlinear constraints, since adding a linear equality constraint essentially means intersecting the constraint polytope with a lower-dimensional affine subspace, so the resulting intersection is still a convex polytope in that subspace.
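For concreteness (my notation, not from the comment above): with inequality constraints $Ax \le b$ and one linear equality $c^\top x = d$, the feasible set is $\{x \in \mathbb{R}^n : Ax \le b\} \cap \{x : c^\top x = d\}$, i.e. a polytope intersected with an affine hyperplane, which is again a convex polytope sitting inside an $(n-1)$-dimensional affine subspace; a sampler therefore has to operate within that subspace rather than in the full box.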

We have some of the necessary components for doing this implemented on the BoTorch end (namely sampling from such spaces). Methodologically there shouldn't be anything else overly challenging, but I suspect it might require quite a bit of engineering work to hook this up properly.

@lena-kashtelyan Most of the use cases I've seen involving == look something like param_a*param_b == constant,

I want to second this; I tried an alternative approach:

con_1 = SumConstraint(
    parameters=[salt_1, salt_2, additive_1, additive_2, solvent_1, solvent_2],
    is_upper_bound=True,
    bound=1,
)

con_2 = SumConstraint(
    parameters=[salt_1, salt_2, additive_1, additive_2, solvent_1, solvent_2],
    is_upper_bound=False,
    bound=1,
)

Theoretically this should work, since the inequalities are not strict. But I had no luck:

raise SearchSpaceExhausted(
ax.exceptions.core.SearchSpaceExhausted: Rejection sampling error (specified maximum draws (10000) exhausted, without finding sufficiently many (1) candidates). This likely means that there are no new points left in the search space.

I interpret this to mean that the points satisfying the equality are very sparse.

It would be fantastic if the equality constraint could be implemented officially.

Thanks in advance!

I interpret this to mean that the points satisfying the equality are very sparse.

Yes, this currently uses rejection sampling by default so with an equality constraint the feasible set has zero volume and this will be the outcome.

We have added some features to BoTorch under the hood that allow sampling in different ways, and we plan to hook things up in Ax to support such parameter constraints. That will take a bit of work, though, since it touches storage layers and a lot of other places.

We will now be tracking wishlist items / feature requests in a master issue for improved visibility: #566. Of course please feel free to still open new feature requests issues; we'll take care of thinking them through and adding them to the master issue.

I'm guessing that something like the following still suffers from the issue of small volume with rejection sampling:

con_1 = SumConstraint(
    parameters=[salt_1, salt_2, additive_1, additive_2, solvent_1, solvent_2],
    is_upper_bound=True,
    bound=1.001,
)

con_2 = SumConstraint(
    parameters=[salt_1, salt_2, additive_1, additive_2, solvent_1, solvent_2],
    is_upper_bound=False,
    bound=0.999,
)

I suppose that you could also relax the bounds significantly, then renormalize within the objective function and when returning best_parameters, but I'm not sure whether this would cause excessive runtime bloat or fundamentally change the nature of the solutions:

0.5 <= sum(parameters) <= 1.5
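To make that concrete, here is a minimal sketch of the relax-and-renormalize idea using the Service API (the parameter names, the 0.5/1.5 band, and the dummy objective are all illustrative assumptions on my part):

import numpy as np
from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties

ax_client = AxClient()
ax_client.create_experiment(
    name="relaxed_sum_constraint",
    parameters=[
        {"name": f"x{i}", "type": "range", "bounds": [0.0, 1.0]} for i in range(1, 4)
    ],
    # relaxed band around the intended equality x1 + x2 + x3 == 1
    parameter_constraints=["x1 + x2 + x3 <= 1.5", "x1 + x2 + x3 >= 0.5"],
    objectives={"obj": ObjectiveProperties(minimize=False)},
)

def evaluate(parameterization):
    x = np.array([parameterization[f"x{i}"] for i in range(1, 4)])
    x = x / x.sum()  # renormalize so the composition sums to exactly 1 before scoring
    return {"obj": float(-np.sum((x - 1.0 / 3.0) ** 2))}  # placeholder objective

for _ in range(10):
    parameterization, trial_index = ax_client.get_next_trial()
    ax_client.complete_trial(trial_index=trial_index, raw_data=evaluate(parameterization))

The same renormalization would then have to be applied to whatever get_best_parameters returns.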

#727

There was also mention of being able to pass linear equality constraints directly to BoTorch, but IIUC you have to take care of the data transforms yourself. #769 (comment)
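For reference, this is roughly what that looks like at the BoTorch level; a hedged sketch only (toy model and data, with the equality constraint expressed as an (indices, coefficients, rhs) tuple; you would still be responsible for any Ax-side transforms yourself):

import torch
from botorch.acquisition import ExpectedImprovement
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

# toy training data in the unit cube
train_X = torch.rand(8, 3, dtype=torch.double)
train_Y = train_X.sum(dim=-1, keepdim=True)

model = SingleTaskGP(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))

candidate, _ = optimize_acqf(
    acq_function=ExpectedImprovement(model, best_f=train_Y.max()),
    bounds=torch.tensor([[0.0] * 3, [1.0] * 3], dtype=torch.double),
    q=1,
    num_restarts=5,
    raw_samples=64,
    # enforce x_0 + x_1 + x_2 == 1.0 on the generated candidate
    equality_constraints=[(torch.tensor([0, 1, 2]), torch.ones(3, dtype=torch.double), 1.0)],
)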

If you reparameterize a summation linear equality constraint (e.g. x_1 + x_2 + ... + x_n == 1.0) as a linear inequality constraint (x_1 + x_2 + ... + x_{n-1} <= 1.0, with x_n = 1.0 - (x_1 + ... + x_{n-1})) #727 (comment), the constraints are unchanged, but Sobol sampling will be biased towards sampling smaller values for x_n. #727 (comment)

xref: #903

raise SearchSpaceExhausted(
ax.exceptions.core.SearchSpaceExhausted: Rejection sampling error (specified maximum draws (10000) exhausted, without finding sufficiently many (1) candidates). This likely means that there are no new points left in the search space.

It might be possible to address this by using fallback_to_sample_polytope #694 (comment), but it's hard to know whether it would run without errors until after an initial attempt.
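If I understand that suggestion correctly, the flag gets passed to the Sobol generator through model_kwargs in a custom GenerationStrategy; a rough, untested sketch (step counts are arbitrary):

from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient

gs = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.SOBOL,
            num_trials=5,
            # fall back to polytope sampling when rejection sampling exhausts its draws
            model_kwargs={"fallback_to_sample_polytope": True},
        ),
        GenerationStep(model=Models.GPEI, num_trials=-1),
    ]
)

ax_client = AxClient(generation_strategy=gs)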

There was also mention of being able to pass linear equality constraints directly to BoTorch, but IIUC you have to take care of the data transforms yourself. #769 (comment)

If you reparameterize a summation linear equality constraint (e.g. x_1 + x_2 + ... + x_n == 1.0) as a linear inequality constraint (x_1 + x_2 + ... + x_{n-1} <= 1.0, with x_n = 1.0 - (x_1 + ... + x_{n-1})) #727 (comment), the constraints are unchanged, but Sobol sampling will be biased towards sampling smaller values for x_n. #727 (comment)

xref: #903

I tried this workaround; I actually did not observe a huge bias for the dropped x_n in the Sobol step. However, in my experiment x_n is heavily unfavored in the following GPEI steps, which is pretty bad.

Having another thought that might work:
We apply no constraint on these n x_i input parameters, other than each x_i lying in [0, 1]. Instead, we build a customized kernel function that simply adds a normalization step (normalizing so the values sum to 1) and uses the normalized values to compute the covariance matrix.

E.g., say the total number of components is n = 3:
[0.4, 0.4, 0.4] will produce the same covariance as [0.6, 0.6, 0.6] after the normalization step.

Not quite sure if this workaround will lead to other issues, comments are welcome!
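For what it's worth, here is a minimal sketch of what such a normalizing kernel could look like in GPyTorch (this is my own illustration, not wired into Ax; the class name and defaults are made up):

import torch
from gpytorch.kernels import Kernel, MaternKernel

class NormalizedKernel(Kernel):
    """Wraps a base kernel and normalizes each input to sum to 1 first."""

    def __init__(self, base_kernel=None, **kwargs):
        super().__init__(**kwargs)
        self.base_kernel = base_kernel if base_kernel is not None else MaternKernel(nu=2.5)

    def forward(self, x1, x2, diag=False, **params):
        # rescale each point so its components sum to 1 (guard against zero sums)
        x1_norm = x1 / x1.sum(dim=-1, keepdim=True).clamp_min(1e-12)
        x2_norm = x2 / x2.sum(dim=-1, keepdim=True).clamp_min(1e-12)
        return self.base_kernel.forward(x1_norm, x2_norm, diag=diag, **params)

# [0.4, 0.4, 0.4] and [0.6, 0.6, 0.6] both normalize to [1/3, 1/3, 1/3],
# so the kernel treats them as the same point
k = NormalizedKernel()
x_a = torch.tensor([[0.4, 0.4, 0.4]])
x_b = torch.tensor([[0.6, 0.6, 0.6]])
print(k.forward(x_a, x_b))  # equals the kernel value at zero distance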

There was also mention of being able to pass linear equality constraints directly to BoTorch, but IIUC you have to take care of the data transforms yourself. #769 (comment)
If you reparameterize a summation linear equality constraint (e.g. x_1 + x_2 + ... + x_n == 1.0) as a linear inequality constraint (x_1 + x_2 + ... + x_{n-1} <= 1.0, with x_n = 1.0 - (x_1 + ... + x_{n-1})) #727 (comment), the constraints are unchanged, but Sobol sampling will be biased towards sampling smaller values for x_n. #727 (comment)
xref: #903

I tried this workaround; I actually did not observe a huge bias for the dropped x_n in the Sobol step. However, in my experiment x_n is heavily unfavored in the following GPEI steps, which is pretty bad.

Does this happen if you swap around your parameters? e.g. instead of $x_n$ being "hidden", "hide" $x_1$ or $x_{n-1}$ as the variable that gets parameterized out, does that variable become largely unfavored as well? (Wondering if it has to do with the underlying response surface or the parameterization)

Having another thought that might work: We apply no constraint on these n x_i input parameters, other than each x_i lying in [0, 1]. Instead, we build a customized kernel function that simply adds a normalization step (normalizing so the values sum to 1) and uses the normalized values to compute the covariance matrix.

E.g., say the total number of components is n = 3: [0.4, 0.4, 0.4] will produce the same covariance as [0.6, 0.6, 0.6] after the normalization step.

Not quite sure if this workaround will lead to other issues, comments are welcome!

Have you tried custom kernels in Ax before? I'd be really interested to see a MWE for this if you have one!

@rzc331 @lena-kashtelyan I'm also noticing in a test problem that the dropped $x_n$ is unfavored, not just in Sobol sampling, but also during the BO steps (EDIT: see below).

SearchSpace(
    parameters=[
        RangeParameter(name="filler_A", parameter_type=FLOAT, range=[0.0, 0.7]),
        RangeParameter(name="filler_B", parameter_type=FLOAT, range=[0.0, 0.7]),
        RangeParameter(name="resin_A", parameter_type=FLOAT, range=[0.0, 0.3]),
        RangeParameter(name="resin_B", parameter_type=FLOAT, range=[0.0, 0.3]),
    ],
    parameter_constraints=[
        ParameterConstraint(
            1.0 * filler_A + 1.0 * filler_B + 1.0 * resin_A + 1.0 * resin_B <= 1.0
        ),
        ParameterConstraint(1.0 * filler_A + 1.0 * filler_B <= 0.7),
        ParameterConstraint(1.0 * resin_A + 1.0 * resin_B <= 0.3),
    ],
)
next suggested experiments:
{7: {'filler_A': 0.3119886242471525,
     'filler_B': 0.3419782510706366,
     'resin_A': 0.10490273957092171,
     'resin_B': 0.19509726042907924},
 8: {'filler_A': 0.32154436657521857,
     'filler_B': 0.3784556334247822,
     'resin_A': 0.15266869663515695,
     'resin_B': 0.14733130336484015},
 9: {'filler_A': 0.3899764188071819,
     'filler_B': 0.31002358119282536,
     'resin_A': 0.177623437499831,
     'resin_B': 0.12237656250009948},
 10: {'filler_A': 0.3404039570538693,
      'filler_B': 0.3595960429461242,
      'resin_A': 0.08850167546185723,
      'resin_B': 0.2114983245381318},
 11: {'filler_A': 0.05512362833078018,
      'filler_B': 0.6448763716692211,
      'resin_A': 0.18958296052991244,
      'resin_B': 0.11041703947008807}}

After swapping the 4th and 5th columns (i.e., the 4th column is now hidden instead of the 5th), it turns out that the unfavoring of resin_C (which is no longer hidden) goes away; i.e., it seems that resin_C was really supposed to have lower values.

SearchSpace(
    parameters=[
        RangeParameter(name="filler_A", parameter_type=FLOAT, range=[0.0, 0.7]),
        RangeParameter(name="filler_B", parameter_type=FLOAT, range=[0.0, 0.7]),
        RangeParameter(name="resin_A", parameter_type=FLOAT, range=[0.0, 0.3]),
        RangeParameter(name="resin_C", parameter_type=FLOAT, range=[0.0, 0.3]),
    ],
    parameter_constraints=[
        ParameterConstraint(
            1.0 * filler_A + 1.0 * filler_B + 1.0 * resin_A + 1.0 * resin_C <= 1.0
        ),
        ParameterConstraint(1.0 * filler_A + 1.0 * filler_B <= 0.7),
        ParameterConstraint(1.0 * resin_A + 1.0 * resin_C <= 0.3),
    ],
)
{7: {'filler_A': 1.142248883083011e-15,
     'filler_B': 0.6999999999999975,
     'resin_A': 2.0796527605945007e-16,
     'resin_C': 0.042927028199531335},
 8: {'filler_A': 0.0,
     'filler_B': 0.7,
     'resin_A': 0.21580837994021526,
     'resin_C': 0.0},
 9: {'filler_A': 0.32307612126559265,
     'filler_B': 0.3769238787353088,
     'resin_A': 0.20389916441375147,
     'resin_C': 0.02516358332796705},
 10: {'filler_A': 0.3958521025989199,
      'filler_B': 0.3041478974010122,
      'resin_A': 0.21072112317762745,
      'resin_C': 0.029101820663395245},
 11: {'filler_A': 0.28591403028831053,
      'filler_B': 0.41408596971167116,
      'resin_A': 0.19647040025440682,
      'resin_C': 0.018133220037209758}}

Note that this was using Models.FULLYBAYESIANMOO with a batch size of 5.

Linear equality constraints with the Service API would still be very nice, but there's at least some evidence that the model can indirectly infer some of the information about the hidden variable.

There was also mention of being able to pass linear equality constraints directly to BoTorch, but IIUC you have to take care of the data transforms yourself. #769 (comment)

If you reparameterize a summation linear equality constraint (e.g. x_1 + x_2 + ... + x_n == 1.0) as a linear inequality constraint (x_1 + x_2 + ... + x_{n-1} <= 1.0, with x_n = 1.0 - (x_1 + ... + x_{n-1})) #727 (comment), the constraints are unchanged, but Sobol sampling will be biased towards sampling smaller values for x_n. #727 (comment)

xref: #903

In reference to this particular workaround, am I correct in assuming that you place the sum(x_1, ..., x_{n-1}) <= 1.0 constraint directly in the parameter_constraints keyword (assuming the Service API), while the last constraint (x_n = 1.0 - (x_1 + ... + x_{n-1})) needs to be imposed during the ax_client.get_next_trial phase?

I wasn't sure how this would work, since the parameter values obtained during each generation step come directly from the ax_client, and setting the x_n constraint by hand seems wrong to me.

@Abrikosoff, Ax never sees $x_n$ directly when you do the reparameterization you mentioned. This is why I often refer to it as a "hidden variable." Set the composition_constraint row to True in https://honegumi.readthedocs.io/ to see an example (Colab notebook, permalink).

This is the only place where $x_n$ (in this case, $x_3$) appears in the example:

for _ in range(21):
    parameterization, trial_index = ax_client.get_next_trial()

    # extract parameters
    x1 = parameterization["x1"]
    x2 = parameterization["x2"]
    x3 = total - (x1 + x2)  # composition constraint: x1 + x2 + x3 == total

    results = branin3(x1, x2, x3)
    ax_client.complete_trial(trial_index=trial_index, raw_data=results)
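And for context, the corresponding experiment setup keeps only the inequality part of the constraint in parameter_constraints; roughly like this (a sketch with illustrative bounds, assuming total = 1.0 and a branin3 objective defined elsewhere, not copied verbatim from the linked notebook):

from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties

total = 1.0  # x1 + x2 + x3 == total

ax_client = AxClient()
ax_client.create_experiment(
    name="composition_constraint_example",
    parameters=[
        {"name": "x1", "type": "range", "bounds": [0.0, total]},
        {"name": "x2", "type": "range", "bounds": [0.0, total]},
        # x3 is the "hidden" variable: it never appears as an Ax parameter
    ],
    parameter_constraints=[f"x1 + x2 <= {total}"],
    objectives={"branin3": ObjectiveProperties(minimize=True)},
)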


Thanks for the reply! Have also been using Honegumi quite extensively, and it's a marvelous tool as well!