mlr-org/mlrMBO

Define constraints for a NumericVector param

vrodriguezf opened this issue · 11 comments

Hi,

I have a NumericVectorParam, called init with the constraint that the sum of all its values must be equal to 1. I wonder how to apply MBO with this constraint. I tried two options:

  1. Generate the design without the constraint and then use the argument trafo in the function makeNumericVectorParam to normalize the parameter so that the sum of its values is equal to 1.

  2. Use forbidden = expression(sum(init) != 1.0) in the function makeParamSet. In this case, the default initial design of mlrMBO cannot find any valid point.

Any help to approach this issue?

Many thanks in advance!

The probability that the components of a randomly sampled point sum to exactly 1 is 0, which is why approach 2 is unsuccessful. Basically, you are restricting the search space to an area with probability 0.
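A quick way to see this is to sample points uniformly and count how many satisfy the equality. The sketch below uses base R only (it is not the design generator mlrMBO uses), and the dimension, number of draws and tolerance are arbitrary choices for illustration.

```r
# Base R sketch; n, the number of draws and the tolerance are illustrative.
set.seed(1)
n = 5
draws = replicate(10000, runif(n, min = 0, max = 1))  # n x 10000 matrix
sums = colSums(draws)

# Share of uniformly sampled points that meet the equality constraint exactly
frac_valid = mean(sums == 1.0)
# Even with a small tolerance only a tiny share of points is usable
frac_near = mean(abs(sums - 1.0) < 1e-3)
print(c(exact = frac_valid, near = frac_near))
```

With an exact comparison essentially every sampled point is rejected, so a random initial design has nothing left to work with.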

Approach 1 looks good. The common strategy is to optimize n-1 parameters and use the trafo to map them to n parameters.

library(ParamHelpers)

n = 5
ps = makeParamSet(
  makeNumericVectorParam("x", n, lower = 0, upper = 1, trafo = function(x) {
    x[n] = n - sum(x[1:(n-1)])
    return(x / sum(x))
  })
)

x = sampleValue(ps)
trafoValue(ps, x)

As @mb706 noted, this solution is wrong: values above 1/n are not possible for x[1] to x[n-1].

Please report back if this works.

Hi,

I've tried some experiments with the trafo approach and it works great. I hadn't thought about the possibility of optimizing just n-1 values. Good point! :)

Thank you!

If you have n parameters with one restriction you have n-1 degrees of freedom. I might have missed that point in my explanation.

What jacop posted is my usual hack for generating convex combinations of weights. Everything else, like using forbidden, might either not be fully supported by us or simply be a bad idea.

Ok, got it. The same thing would be applied for multiple constraints right? Just writing them in the trafo function.

That really depends on what you want to do and which constraints you have. Bayesian optimization could potentially learn and handle complex constraints, but MBO does not support this yet.
The trafo allows quite a lot of workarounds, but you have to know and model what you want.
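As one illustration of combining constraints in a single trafo, the base R sketch below enforces both "the weights sum to 1" and "every weight is at least w_min". The second constraint, the name trafo_two and the value of w_min are hypothetical and chosen only for this example; other constraints may need a different construction.

```r
n = 5
w_min = 0.05  # hypothetical lower bound on every weight; needs n * w_min < 1

trafo_two = function(x) {
  w = x / sum(x)               # constraint 1: weights sum to 1
  w_min + (1 - n * w_min) * w  # constraint 2: shift and shrink so every weight >= w_min
}

set.seed(1)
w = trafo_two(runif(n))
print(w)
print(sum(w))
```

The shift by w_min adds n * w_min in total and the remaining mass 1 - n * w_min is spread proportionally, so the sum stays at 1.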

Alright, I'll play with the trafo function then. Thanks again for your help!

mb706 commented

Out of curiosity: Does mlrMBO (and tune() in mlr) officially support parameters of length n-1 with a trafo that emits n dimensions, e.g. makeNumericVectorParam("x", n - 1, 0, 1, trafo = function(x) c(x,1))?

@vrodriguezf: Just in case you are going to copy-paste jacob-r's code from above, it may have a bug (values are mapped to [0, 1/n] or [1/n, 1] instead of [0, 1]). If you do a map from n-1 to n dimensions you may have to use something more complicated, depending on how much you care about the range of your parameters and how uniform the derivative of that map is. This depends a lot on how you expect your function to behave with respect to the parameters.
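The range problem can be checked numerically. The sketch below copies the trafo from the earlier post into a standalone base R function (the name trafo_nminus1 is made up here) and looks at the range of the transformed components over many uniform samples.

```r
set.seed(1)
n = 5

# The trafo from the earlier post, as a standalone function
trafo_nminus1 = function(x) {
  x[n] = n - sum(x[1:(n - 1)])
  x / sum(x)
}

samples = t(replicate(10000, trafo_nminus1(runif(n))))  # 10000 x n matrix
print(range(samples[, 1:(n - 1)]))  # the first n-1 components never exceed 1/n
print(range(samples[, n]))          # the last component never drops below 1/n
```

After the overwrite, sum(x) is always n, so the trafo just divides by n: the first n-1 components end up in [0, 1/n] and the last one in [1/n, 1].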

If your n is small and you want to try something simple that may still work, use normalisation (trafo = function(x) x / sum(x)) in your transformation; the spurious degree of freedom might not be such a big problem.

It might be worth it, especially for larger n, to try

trafo = function(x) { x = -log(1 - x) ; x / sum(x) }

since you might care about the {sum(x) == 1, 0 < x[i] < 1} simplex being sampled approximately uniformly in the beginning.
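To compare the two transformations, the base R sketch below applies plain normalisation and the log variant to the same uniform samples and measures how often one component exceeds 0.5. The function names, n = 5 and the 0.5 threshold are choices made for this illustration. For a uniform distribution on the simplex with n = 5 that probability is 5 * 0.5^4 = 0.3125, since -log(1 - x) turns uniform values into exponential ones and normalised exponentials are uniform on the simplex.

```r
set.seed(1)
n = 5

trafo_norm = function(x) x / sum(x)
trafo_log = function(x) {
  e = -log(1 - x)  # uniform on (0, 1) becomes exponentially distributed
  e / sum(e)
}

u = replicate(10000, runif(n))    # n x 10000 matrix of uniform samples
w_norm = apply(u, 2, trafo_norm)  # each column sums to 1
w_log = apply(u, 2, trafo_log)

# How often does a single component take more than half of the total?
frac_norm = mean(apply(w_norm, 2, max) > 0.5)
frac_log = mean(apply(w_log, 2, max) > 0.5)
print(c(normalise = frac_norm, log = frac_log))
```

Plain normalisation concentrates the samples near the centre of the simplex, so points with one dominant component are rare in the initial design; the log variant reaches them far more often.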

Regarding the question of whether a parameter of length n-1 with a trafo that emits n dimensions is supported: at least sampleValue emits the parameters as expected:

n = 5
ps = makeParamSet(
  makeNumericVectorParam("x", n-1, lower = 0, upper = 1, trafo = function(x) {
    c(x,1)
  })
)
sampleValue(ps, trafo = TRUE)
$x
[1] 0.1218993 0.5609480 0.2065314 0.1275317 1.0000000

@mb706 So you are trying to enlarge the range of values by using the log, right? OK, I'll test that. The value of n I am using is less than 100. I am trying to put this into an n x n matrix (n^2 values) where the sum of each row must be equal to 1.

Thanks again for your help!
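For the matrix case, one possible sketch is to reshape a vector of length n^2 and apply the log transformation row by row. The helper name row_stochastic_trafo is hypothetical, and the code assumes all entries are strictly below 1 (a value of exactly 1 would give -log(0) = Inf). Whether a trafo may return a matrix instead of a vector inside makeNumericVectorParam is not confirmed in this thread, so reshaping inside the objective function may be the safer route.

```r
# Hypothetical helper: maps a vector of length n^2 with entries in [0, 1)
# to an n x n matrix in which every row sums to 1.
row_stochastic_trafo = function(x, n) {
  m = matrix(x, nrow = n, ncol = n, byrow = TRUE)
  e = -log(1 - m)  # same log transformation, applied elementwise
  e / rowSums(e)   # recycling divides row i by its own row sum
}

set.seed(1)
n = 3
m = row_stochastic_trafo(runif(n^2), n)
print(m)
print(rowSums(m))
```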

I edited my answer, as @mb706 is right. I guess you should keep the vector of length n and use the log transformation above. After all it should be possible for MBO to handle the redundant solutions.

Here is @mb706's solution:

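library(mlrMBO)  # mbo(), makeMBOControl()
library(smoof)   # makeSingleObjectiveFunction()

d = 4
# Map points from the unit cube to positive weights that sum to 1
trafo = function(x) {
  x = -log(1 - x)
  x / sum(x)
}
ps = makeParamSet(
  makeNumericVectorParam("x", d, 0, 1, trafo = trafo)
)
fun = makeSingleObjectiveFunction(fn = function(x) prod(x), par.set = ps, minimize = FALSE)
res = mbo(fun = fun, control = makeMBOControl())
res$x
trafo(res$x$x)  # apply the trafo to the returned point to get the weights
res$y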

Alternatively, this "geometric" approach could also work. The idea is that we have a line of length 1 and x holds the positions of the n-1 cuts on this line. The result is n pieces whose lengths sum to 1. This is what the trafo calculates.

The only restriction is that x has to be in ascending order, which is enforced by the forbidden expression.

d = 4
# x holds the d-1 cut positions; adding the end points 0 and 1 yields d pieces that sum to 1
trafo = function(x) {
  diff(c(0, x, 1))
}
ps = makeParamSet(
  makeNumericVectorParam("x", d-1, 0, 1, trafo = trafo),
  # reject points whose cuts are not in ascending order
  forbidden = expression(any(diff(x)<0))
)
fun = makeSingleObjectiveFunction(fn = function(x) prod(x), par.set = ps, minimize = FALSE)
res = mbo(fun = fun, control = makeMBOControl())
res$x
trafo(res$x$x)
res$y

The optimal solution puts equal weight 1/d on every component, which gives a product of (1/d)^d:

(x = rep(1/d, d))
(y = (1/d)^d)
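The geometric idea and the optimum can be checked without running the optimizer. The base R sketch below uses a hypothetical helper cuts_to_pieces that includes both end points of the line and sorts the cuts itself; the sorting is an assumption of this sketch and a possible alternative to the forbidden expression, not something taken from the code above.

```r
d = 4

# Hypothetical helper: d-1 cut positions in [0, 1] become d piece lengths.
# Sorting inside the function makes any ordering of the cuts valid.
cuts_to_pieces = function(x) diff(c(0, sort(x), 1))

set.seed(1)
pieces = cuts_to_pieces(runif(d - 1))
print(pieces)
print(sum(pieces))

# Equally spaced cuts give d equal pieces, the maximizer of prod(x)
x_opt = (1:(d - 1)) / d
w_opt = cuts_to_pieces(x_opt)
print(w_opt)
print(prod(w_opt))
```

Sorting uniform cut positions also samples the simplex uniformly, so this variant keeps the property the log transformation was introduced for while using only d-1 parameters.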