Feature request: allow for constraints on ordered choice parameters
lena-kashtelyan opened this issue · 6 comments
Original comment by @LuddeWessen:
I was about to post an issue on this as well, since I need to put constraints on choice variables, especially constraints between them.
I believe this should be possible for at least two reasons:
If the parameters have an internal ordering, we often need to put constraints on them as if they were range parameters.
Sometimes categorical or ordered parameters need to be constrained to remove "symmetries" in the parameter space, e.g. when parameters (1,2) actually means the same as (2,1), we want to disallow one of them.
I believe that the fix suggested by @Balandat on Jul 17 sounds good.
Proposed solution: #621 (comment)
Merged into master wishlist issue
@LuddeWessen what is the fix you referred to? (pulling from #621 (comment))
So there shouldn't be a fundamental issue including fixed parameters, as we can just absorb their value into a new bound. That said, if the problem is simple enough, you can just do this yourself. Instead of 't1 - lw >= 0' you'd just provide f'-lw <= {-t1}'. If you have a lot of constraints then this may be a bit tedious, but it should always be possible. On our end, we could consider properly parsing out fixed parameters from the constraints and doing this under the hood; we'd have to look at how much work that would be relative to the benefit it provides.
or
As far as choice parameters are concerned, the reason we didn't allow constraints on these was that they were one-hot encoded and thus representing the constraint as a linear constraint is not straightforward. However, we made some changes recently that handle choice parameters in a different way and optimize over their actual values, so it would potentially be possible to incorporate them into parameter constraints (in fact, potentially even nonlinear ones).
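For the first quoted fix (absorbing a fixed parameter's value into a constraint bound), a minimal sketch with the Service API might look like the following; the parameter names x1, x2, t1 and the desired constraint "x1 + x2 + t1 <= 1.0" are hypothetical stand-ins, not taken from #621:

from ax.service.ax_client import AxClient

t1 = 0.3  # known value of the fixed parameter (hypothetical)

ax_client = AxClient()
ax_client.create_experiment(
    name="fold_fixed_parameter_into_bound",  # hypothetical experiment
    parameters=[
        {"name": "x1", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 1.0]},
        # The fixed parameter itself cannot appear in a constraint string...
        {"name": "t1", "type": "fixed", "value": t1},
    ],
    objective_name="objective",
    minimize=True,
    # ...so instead of "x1 + x2 + t1 <= 1.0", absorb its known value into the bound.
    parameter_constraints=[f"x1 + x2 <= {1.0 - t1}"],
)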
Why do you think (one of these) would address the issue of symmetry that you mentioned?
when parameters (1,2) actually means the same as (2,1), we want to disallow one of them
I'm asking because it seems relevant to a possible implementation for my use case (see "Components/Composition version" in #727 (comment)). I've been thinking about doing data augmentation instead, where you just supply Ax with the degenerate cases every time a new trial is added.
The number of symmetrical cases can start to get very large. In the case of n = 3, where n is the number of distinct objects, {{1, 2, 3}, {1, 3, 2}, {2, 1, 3}, {2, 3, 1}, {3, 1, 2}, {3, 2, 1}} would all be considered equivalent, meaning that n! - n data augmentation points would be added. I'm not sure when Ax starts to get sluggish (maybe tens of thousands of points? @lena-kashtelyan), but this could be a reasonable approach when n is small (e.g. n < 7) and/or the initial dataset is small (~O(100) points), depending on the reasonable upper limit on the number of points in Ax models.
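A rough sketch of that augmentation loop with the Service API, assuming hypothetical parameter names object1/object2/object3 and a scalar objective (my own illustration, not code from the thread):

from itertools import permutations

SYMMETRIC_PARAMS = ["object1", "object2", "object3"]  # hypothetical names

def attach_permuted_copies(ax_client, parameters, objective_value):
    # After a real trial completes, attach every other permutation of the
    # symmetric parameters as an already-evaluated trial with the same outcome.
    base = [parameters[name] for name in SYMMETRIC_PARAMS]
    for perm in set(permutations(base)):
        if list(perm) == base:
            continue  # the original trial is already in the experiment
        augmented = dict(parameters)
        augmented.update(dict(zip(SYMMETRIC_PARAMS, perm)))
        _, trial_index = ax_client.attach_trial(parameters=augmented)
        ax_client.complete_trial(trial_index=trial_index, raw_data=objective_value)

For n symmetric slots this can attach a factorial number of extra points per real observation, which is where the scaling concern comes from.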
@sgbaird I meant the latter one, as I believe nonlinear constraints will be needed to restrict the search space for many use cases.
My thought is that allowing non-linear constraints on categorical variables would open the door to user-defined constraints, or even to hooking up CSP solvers, such as Google OR-Tools, that are available in Python.
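As a hedged illustration of the CSP idea (my own sketch, not anything Ax supports today), OR-Tools' CP-SAT could enumerate only the symmetry-free assignments of integer-encoded choice slots, which could then be handed to Ax as candidate parameterizations:

from ortools.sat.python import cp_model

# Three choice "slots", each encoded as an integer 0..2 (hypothetical encoding).
model = cp_model.CpModel()
slots = [model.NewIntVar(0, 2, f"object{i}") for i in (1, 2, 3)]

# Symmetry-breaking: require object1 >= object2 >= object3 so that only one
# representative of each permutation class is feasible.
model.Add(slots[0] >= slots[1])
model.Add(slots[1] >= slots[2])

class Collector(cp_model.CpSolverSolutionCallback):
    def __init__(self, variables):
        super().__init__()
        self.variables = variables
        self.solutions = []

    def on_solution_callback(self):
        self.solutions.append(tuple(self.Value(v) for v in self.variables))

solver = cp_model.CpSolver()
solver.parameters.enumerate_all_solutions = True
collector = Collector(slots)
solver.Solve(model, collector)
print(collector.solutions)  # only non-increasing triples, one per permutation class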
Mathematically, your approach of data augmentation seems correct. However, I am not sure it is the right way to go, since the underlying method scales poorly with an increasing number of data points; then again, I am no GP expert.
Regarding symmetries: in your previous post you outline an approach to breaking them, and in my understanding your "components/composition" problem formulation is a clever one.
However, instead of posting the constraint "composition1 > composition2 > composition3", don't you want to post "object1 > object2 > object3" instead?
Then, whenever Ax samples values for the choice parameters (object1, object2, object3), all but one of the symmetrical solutions are disallowed. I.e., if we have an internal ordering of the values A, B, C s.t. A > B > C, then (A, B, C) is a feasible assignment for the (object1, object2, object3) variables, whereas all other permutations* are infeasible and thus those experiments are not suggested. (*The infeasible permutations being (A,C,B), (B,A,C), (B,C,A), (C,A,B), (C,B,A).)
@LuddeWessen, thank you for clarifying.
I had some experience with a previous project involving data augmentation in a GPR model (https://github.com/sgbaird-5DOF/interp). In that case, where there was a lot of symmetry and therefore many degenerate cases, the model improved when the nearby degenerate cases were included, with the improvement being especially pronounced near the borders of the "fundamental zone" (one contiguous region of non-degenerate points); however, that kind of approach was restricted to small datasets.
I like your idea of imposing constraints on the components rather than the compositions. A quick test showed that I won't be able to use string variables for an order constraint on the choice parameters:
Exception has occurred: ValueError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
Parameter constraints only supported for numeric parameters.
File "C:\Users\sterg\miniconda3\envs\[my-env]\Lib\site-packages\ax\core\parameter_constraint.py", line 266, in validate_constraint_parameters
raise ValueError(
File "C:\Users\sterg\miniconda3\envs\[my-env]\Lib\site-packages\ax\core\parameter_constraint.py", line 122, in __init__
validate_constraint_parameters([lower_parameter, upper_parameter])
File "C:\Users\sterg\miniconda3\envs\[my-env]\Lib\site-packages\ax\service\utils\instantiation.py", line 283, in constraint_from_str
else OrderConstraint(
File "C:\Users\sterg\miniconda3\envs\[my-env]\Lib\site-packages\ax\service\utils\instantiation.py", line 489, in <listcomp>
constraint_from_str(c, parameter_map) for c in parameter_constraints
File "C:\Users\sterg\miniconda3\envs\[my-env]\Lib\site-packages\ax\service\utils\instantiation.py", line 488, in make_search_space
typed_parameter_constraints = [
File "C:\Users\sterg\miniconda3\envs\[my-env]\Lib\site-packages\ax\service\utils\instantiation.py", line 663, in make_experiment
search_space=make_search_space(parameters, parameter_constraints or []),
File "C:\Users\sterg\miniconda3\envs\[my-env]\Lib\site-packages\ax\service\ax_client.py", line 305, in create_experiment
experiment = make_experiment(
File "C:\Users\sterg\Documents\GitHub\sparks-issa\[my-env]\[my-file].py", line 137, in <module>
ax_client.create_experiment(
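For reference, a hypothetical reconstruction of the kind of call that produces the traceback above (the parameter names and choices are stand-ins, not the actual script):

from ax.service.ax_client import AxClient

ax_client = AxClient()
ax_client.create_experiment(
    name="order_constraint_on_string_choices",  # hypothetical
    parameters=[
        {"name": "object1", "type": "choice", "values": ["A", "B", "C"], "is_ordered": True},
        {"name": "object2", "type": "choice", "values": ["A", "B", "C"], "is_ordered": True},
        {"name": "object3", "type": "choice", "values": ["A", "B", "C"], "is_ordered": True},
    ],
    objective_name="objective",
    minimize=True,
    # Fails with "Parameter constraints only supported for numeric parameters."
    parameter_constraints=["object1 >= object2", "object2 >= object3"],
)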
However, if I convert these to ordinal (numerically encoded) categorical variables, I think the code would run fine. I'm just not sure whether it matters how the numeric values are assigned to the categories. For example, should choices that are similar to each other also be close in numeric value, or does it not matter? (I'm guessing it does, see #750.)
I think the only leftover degeneracy in the space that might need to be considered is the case where the same component is chosen multiple times. For example, component1 == component2 == component3 or component1 == component3 is feasible (e.g. AAA or ABA), when on our side we know that this actually simplifies to 3*A (a 1-component system) or 2*A + B (a 2-component system). Trying to remove these degeneracies would (I think) manifest itself as another non-linear constraint, easily represented by if statements and set differences, but probably not possible using a linear constraint. For this one, data augmentation is one option, but I don't remember off the top of my head how the degeneracy scales with n in this case (might update this comment later).
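As a concrete (purely illustrative) version of that deduplication idea, a canonicalization helper could merge repeated components before deciding whether a point is degenerate; nothing here is Ax API:

def canonicalize(components, fractions):
    # Merge repeated components by summing their fractions, then sort, so that
    # e.g. ("A", "B", "A") with (0.25, 0.5, 0.25) and ("A", "A", "B") with
    # (0.25, 0.25, 0.5) both map to (("A", 0.5), ("B", 0.5)).
    merged = {}
    for component, fraction in zip(components, fractions):
        merged[component] = merged.get(component, 0.0) + fraction
    return tuple(sorted(merged.items()))

# Two degenerate parameterizations collapse to the same canonical point:
assert canonicalize(("A", "B", "A"), (0.25, 0.5, 0.25)) == canonicalize(("A", "A", "B"), (0.25, 0.25, 0.5))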
Based on @Balandat's #750 (comment), I tried to make an MWE (now just a reproducer 😓, see the Colab notebook). As suggested, order constraints can't be imposed on ordinal (choice) parameters, even if they're numeric:
/usr/local/lib/python3.7/dist-packages/ax/service/ax_client.py in create_experiment(self, parameters, name, objective_name, minimize, objectives, parameter_constraints, outcome_constraints, status_quo, overwrite_existing_experiment, experiment_type, tracking_metric_names, choose_generation_strategy_kwargs, support_intermediate_data, immutable_search_space_and_opt_config, is_test)
292 immutable_search_space_and_opt_config=immutable_search_space_and_opt_config,
293 is_test=is_test,
--> 294 **objective_kwargs,
295 )
296 self._set_experiment(
/usr/local/lib/python3.7/dist-packages/ax/service/utils/instantiation.py in make_experiment(parameters, name, parameter_constraints, outcome_constraints, status_quo, experiment_type, tracking_metric_names, objective_name, minimize, objectives, objective_thresholds, support_intermediate_data, immutable_search_space_and_opt_config, is_test)
616 return Experiment(
617 name=name,
--> 618 search_space=make_search_space(parameters, parameter_constraints or []),
619 optimization_config=optimization_config,
620 status_quo=status_quo_arm,
/usr/local/lib/python3.7/dist-packages/ax/service/utils/instantiation.py in make_search_space(parameters, parameter_constraints)
479
480 typed_parameter_constraints = [
--> 481 constraint_from_str(c, parameter_map) for c in parameter_constraints
482 ]
483
/usr/local/lib/python3.7/dist-packages/ax/service/utils/instantiation.py in <listcomp>(.0)
479
480 typed_parameter_constraints = [
--> 481 constraint_from_str(c, parameter_map) for c in parameter_constraints
482 ]
483
/usr/local/lib/python3.7/dist-packages/ax/service/utils/instantiation.py in constraint_from_str(representation, parameters)
274 lower_parameter=parameters[left], upper_parameter=parameters[right]
275 )
--> 276 if COMPARISON_OPS[tokens[1]] is ComparisonOp.LEQ
277 else OrderConstraint(
278 lower_parameter=parameters[right], upper_parameter=parameters[left]
/usr/local/lib/python3.7/dist-packages/ax/core/parameter_constraint.py in __init__(self, lower_parameter, upper_parameter)
120 [1, -1] * [p1, p2]^T <= 0.
121 """
--> 122 validate_constraint_parameters([lower_parameter, upper_parameter])
123
124 self._lower_parameter = lower_parameter
/usr/local/lib/python3.7/dist-packages/ax/core/parameter_constraint.py in validate_constraint_parameters(parameters)
276 # Ax models only support linear constraints.
277 if isinstance(parameter, ChoiceParameter):
--> 278 raise ValueError("Parameter constraints not supported for ChoiceParameter.")
279
280 # Log parameters require a non-linear transformation, and Ax
ValueError: Parameter constraints not supported for ChoiceParameter.
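A hypothetical reconstruction of the kind of call in the notebook that hits this error (parameter names and values are placeholders):

from ax.service.ax_client import AxClient

ax_client = AxClient()
ax_client.create_experiment(
    name="order_constraint_on_numeric_choices",  # hypothetical
    parameters=[
        # Numeric, ordered choice parameters still hit the ChoiceParameter check.
        {"name": "c1", "type": "choice", "values": [1, 2, 3], "is_ordered": True},
        {"name": "c2", "type": "choice", "values": [1, 2, 3], "is_ordered": True},
    ],
    objective_name="objective",
    minimize=True,
    # Fails with "Parameter constraints not supported for ChoiceParameter."
    parameter_constraints=["c1 >= c2"],
)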
It looks like data augmentation is probably the right solution for me (small dataset, straightforward implementation).
@LuddeWessen curious whether you've seen anything that studies the effect of these symmetries on search efficiency, e.g. ignoring vs. removing vs. data-augmenting the symmetries?