On solving bound constrained sub-problems for an overall method
Closed this issue · 7 comments
Dear Admin,
I am working on a method in continuous optimization, where I have to solve a sequence of sub-problems of the form (min f(x) s.t. F(x,z,t) = 0, z >= 0) depending on a parameter t. For a test problem, in some inner iterations of an outer iteration, I get the following error:
ERROR: LoadError: outside of the trust region: ‖x‖²= NaN, Δ²=1.0e+04
What can I change in order to bypass this problem?
Thank you.
Here is the test problem I want to solve:
using NLPModels, Percival, Logging  # note: in recent versions, ADNLPModel is provided by ADNLPModels.jl
function test()
cc = [10.0, 8.0, 6.0, 4.0, 2.0]
K = [5.0, 5.0, 5.0, 5.0, 5.0]
b = [1.2, 1.1, 1.0, 0.9, 0.8]
L = 150.0
g = 1.0
gg = 5000.0 ^ (1 / g)
x_last = zeros(37)  # initialize to zeros so no entry is left undefined
x_last[1] = 75.0
# the parameter t is rp, and varies in a for loop from 1.0 to 10.0^-10
for ii in 0:10
rp = 10.0^-ii
(x0, m, f, lvar, uvar, c) = (
x_last,
28,
x -> cc[1] * x[1] + b[1] / (b[1] + 1) * K[1] ^ (-1 / b[1]) * x[1] ^ ((1 + b[1]) / b[1]) - x[1] *(gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)),
zeros(37),
vcat(L*ones(5), Inf*ones(32)),
x -> [
(cc[2] + K[2] ^ (-1 / b[2]) * x[2]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[2] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[6] - x[7]);
(cc[3] + K[3] ^ (-1 / b[3]) * x[3]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[3] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[8] - x[9]);
(cc[4] + K[4] ^ (-1 / b[4]) * x[4]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[4] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[10] - x[11]);
(cc[5] + K[5] ^ (-1 / b[5]) * x[5]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[5] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[12] - x[13]);
# COMPLEMENTARITY n = 13, ni = 2, nc = 8
x[2] - x[14]; L - x[2] - x[15]; x[3] - x[16]; L - x[3] - x[17];
x[4] - x[18]; L - x[4] - x[19]; x[5] - x[20]; L - x[5] - x[21];
x[6] + rp - x[22]; x[7] + rp - x[23]; x[8] + rp - x[24]; x[9] + rp - x[25];
x[10] + rp - x[26]; x[11] + rp - x[27]; x[12] + rp - x[28]; x[13] + rp - x[29];
(x[2] - rp) * x[6] + x[30]; (L - x[2] - rp) * x[7] + x[31];
(x[3] - rp) * x[8] + x[32]; (L - x[3] - rp) * x[9] + x[33];
(x[4] - rp) * x[10] + x[34]; (L - x[4] - rp) * x[11] + x[35];
(x[5] - rp) * x[12] + x[36]; (L - x[5] - rp) * x[13] + x[37]
]
)
println(x_last);
nlp = ADNLPModel(f, x0, lvar, uvar, c, zeros(m), zeros(m));
output = with_logger(NullLogger()) do
percival(nlp, subsolver_logger = NullLogger())
end
println("OUTPUT $(ii) : ", output)
x_last = output.solution
end
end
Hi @TashTatut, thanks for trying to use Percival.jl. I hope we can help you.
I reformatted your problem and pasted it as a gist to make it easier to read and run: https://gist.github.com/abelsiqueira/456febd8395d6420e5ddfd7ad7ee8dd8. This is already the "fixed" version.
About the issue: it appears that, during the search for a solution, the method eventually tries a point with x[1:5] = 0, which leads to Inf and then probably to NaN. To test this, I changed the bounds to [1e-6 * ones(5); zeros(32)], which means that x[1:5] > 0. Unless you expect your solution to have a small value in any of x[1:5], this should be safe. It worked after that.
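To see concretely how the NaN can arise, here is a minimal sketch using the constants from the code above: at x[1:5] = 0, the term gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g) evaluates to Inf, and the objective multiplies it by x[1] = 0, which gives 0 * Inf = NaN.
gg, g = 5000.0, 1.0
x = zeros(5)
term = gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)  # 5000.0 * 0.0 ^ -1.0 == Inf
x[1] * term                                                # 0.0 * Inf == NaN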
TL;DR: Change lvar to [1e-6 * ones(5); zeros(32)].
Let me know if this works.
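For concreteness, a minimal sketch of that change against the ADNLPModel construction in your code (same variable names):
lvar = [1e-6 * ones(5); zeros(32)]  # keep x[1:5] strictly positive, away from the singularity
nlp = ADNLPModel(f, x0, lvar, uvar, c, zeros(m), zeros(m))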
I investigated a little too, and Inf and NaN appear during a Hessian-vector product (https://github.com/amontoison/JSOSolvers.jl/blob/master/src/tron.jl#L304). I suspect that the problem comes from ForwardDiff, which has difficulty differentiating your model at extreme values. I tested with another automatic-differentiation backend and I didn't encounter any NaN or Inf.
using NLPModels, Percival, Logging, JuMP, NLPModelsJuMP
function pb(rp)
cc = [10.0, 8.0, 6.0, 4.0, 2.0]
K = [5.0, 5.0, 5.0, 5.0, 5.0]
b = [1.2, 1.1, 1.0, 0.9, 0.8]
L = 150.0
g = 1.0
gg = 5000.0 ^ (1 / g)
lvar = zeros(37)
uvar = vcat(L*ones(5), Inf*ones(32))
x0 = zeros(37)
x0[1] = 75.0
nlp = Model()
@variable(nlp, lvar[i] <= x[i=1:37] <= uvar[i], start=x0[i])
@NLconstraint(nlp, (cc[2] + K[2] ^ (-1 / b[2]) * x[2]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[2] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[6] - x[7]) == 0)
@NLconstraint(nlp, (cc[3] + K[3] ^ (-1 / b[3]) * x[3]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[3] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[8] - x[9]) == 0)
@NLconstraint(nlp, (cc[4] + K[4] ^ (-1 / b[4]) * x[4]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[4] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[10] - x[11]) == 0)
@NLconstraint(nlp, (cc[5] + K[5] ^ (-1 / b[5]) * x[5]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[5] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[12] - x[13]) == 0)
@constraint(nlp, x[2] - x[14] == 0)
@constraint(nlp, L - x[2] - x[15] == 0)
@constraint(nlp, x[3] - x[16] == 0)
@constraint(nlp, L - x[3] - x[17] == 0)
@constraint(nlp, x[4] - x[18] == 0)
@constraint(nlp, L - x[4] - x[19] == 0)
@constraint(nlp, x[5] - x[20] == 0)
@constraint(nlp, L - x[5] - x[21] == 0)
@constraint(nlp, x[6] + rp - x[22] == 0)
@constraint(nlp, x[7] + rp - x[23] == 0)
@constraint(nlp, x[8] + rp - x[24] == 0)
@constraint(nlp, x[9] + rp - x[25] == 0)
@constraint(nlp, x[10] + rp - x[26] == 0)
@constraint(nlp, x[11] + rp - x[27] == 0)
@constraint(nlp, x[12] + rp - x[28] == 0)
@constraint(nlp, x[13] + rp - x[29] == 0)
@NLconstraint(nlp, (x[2] - rp) * x[6] + x[30] == 0)
@NLconstraint(nlp, (L - x[2] - rp) * x[7] + x[31] == 0)
@NLconstraint(nlp, (x[3] - rp) * x[8] + x[32] == 0)
@NLconstraint(nlp, (L - x[3] - rp) * x[9] + x[33] == 0)
@NLconstraint(nlp, (x[4] - rp) * x[10] + x[34] == 0)
@NLconstraint(nlp, (L - x[4] - rp) * x[11] + x[35] == 0)
@NLconstraint(nlp, (x[5] - rp) * x[12] + x[36] == 0)
@NLconstraint(nlp, (L - x[5] - rp) * x[13] + x[37] == 0)
@NLobjective(nlp, Min, cc[1] * x[1] + b[1] / (b[1] + 1) * K[1] ^ (-1 / b[1]) * x[1] ^ ((1 + b[1]) / b[1]) - x[1] * (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)))
return MathOptNLPModel(nlp)
end
for ii in 0:10
rp = 10.0^-ii
nlp = pb(rp)
output = with_logger(NullLogger()) do
percival(nlp, subsolver_logger = NullLogger())
end
println("OUTPUT $(ii) : ", output)
x_last = output.solution
end
Hi @abelsiqueira, hi @amontoison. Thanks for interacting with me. The requirement of the algorithm is to start each iteration from the previous iteration's solution. I tested the other automatic-differentiation backend proposed by @amontoison on other model examples (since I need to validate the solution method on a collection of test models) and it worked. For this model example, here is the loop I used, where I modified the pb() function to take sol as a starting point. I get the same error in iteration 4. When I modify the lower bound lvar to [1e-6 * ones(5); zeros(32)], the error occurs in the first iteration. For information, when using the Lancelot solver with the AMPL language through the NEOS server (https://neos-server.org/neos/solvers/nco:LANCELOT/AMPL.html), the problem is solved. Thanks.
for ii in 0:10  # start at 0 so that sol is initialized on the first pass
rp = 10.0^-ii
global sol
if ii == 0
sol = zeros(37)
sol[1] = 75.0
end
nlp = pb(sol, rp)
output = with_logger(NullLogger()) do
percival(nlp, subsolver_logger = NullLogger())
end
cp = deepcopy(output)
sol = cp.solution
println("OUTPUT $(ii + 1) : ", output)
end
@TashTatut, I ran the following code with Julia 1.5 and I didn't encounter any error:
using NLPModels, Percival, Logging, JuMP, NLPModelsJuMP
function pb(sol, rp)
cc = [10.0, 8.0, 6.0, 4.0, 2.0]
K = [5.0, 5.0, 5.0, 5.0, 5.0]
b = [1.2, 1.1, 1.0, 0.9, 0.8]
L = 150.0
g = 1.0
gg = 5000.0 ^ (1 / g)
lvar = zeros(37)
uvar = vcat(L*ones(5), Inf*ones(32))
nlp = Model()
@variable(nlp, lvar[i] <= x[i=1:37] <= uvar[i], start=sol[i])
@NLconstraint(nlp, (cc[2] + K[2] ^ (-1 / b[2]) * x[2]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[2] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[6] - x[7]) == 0)
@NLconstraint(nlp, (cc[3] + K[3] ^ (-1 / b[3]) * x[3]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[3] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[8] - x[9]) == 0)
@NLconstraint(nlp, (cc[4] + K[4] ^ (-1 / b[4]) * x[4]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[4] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[10] - x[11]) == 0)
@NLconstraint(nlp, (cc[5] + K[5] ^ (-1 / b[5]) * x[5]) - (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)) - x[5] * (-1 / g * gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 - 1 / g)) - (x[12] - x[13]) == 0)
@constraint(nlp, x[2] - x[14] == 0)
@constraint(nlp, L - x[2] - x[15] == 0)
@constraint(nlp, x[3] - x[16] == 0)
@constraint(nlp, L - x[3] - x[17] == 0)
@constraint(nlp, x[4] - x[18] == 0)
@constraint(nlp, L - x[4] - x[19] == 0)
@constraint(nlp, x[5] - x[20] == 0)
@constraint(nlp, L - x[5] - x[21] == 0)
@constraint(nlp, x[6] + rp - x[22] == 0)
@constraint(nlp, x[7] + rp - x[23] == 0)
@constraint(nlp, x[8] + rp - x[24] == 0)
@constraint(nlp, x[9] + rp - x[25] == 0)
@constraint(nlp, x[10] + rp - x[26] == 0)
@constraint(nlp, x[11] + rp - x[27] == 0)
@constraint(nlp, x[12] + rp - x[28] == 0)
@constraint(nlp, x[13] + rp - x[29] == 0)
@NLconstraint(nlp, (x[2] - rp) * x[6] + x[30] == 0)
@NLconstraint(nlp, (L - x[2] - rp) * x[7] + x[31] == 0)
@NLconstraint(nlp, (x[3] - rp) * x[8] + x[32] == 0)
@NLconstraint(nlp, (L - x[3] - rp) * x[9] + x[33] == 0)
@NLconstraint(nlp, (x[4] - rp) * x[10] + x[34] == 0)
@NLconstraint(nlp, (L - x[4] - rp) * x[11] + x[35] == 0)
@NLconstraint(nlp, (x[5] - rp) * x[12] + x[36] == 0)
@NLconstraint(nlp, (L - x[5] - rp) * x[13] + x[37] == 0)
@NLobjective(nlp, Min, cc[1] * x[1] + b[1] / (b[1] + 1) * K[1] ^ (-1 / b[1]) * x[1] ^ ((1 + b[1]) / b[1]) - x[1] * (gg * (x[1] + x[2] + x[3] + x[4] + x[5]) ^ (-1 / g)))
return MathOptNLPModel(nlp)
end
sol = zeros(37)
sol[1] = 75.0
for ii in 1:10
global sol  # needed when running as a script, so the assignment below updates the global sol
println(sol)
rp = 10.0^-ii
nlp = pb(sol, rp)
output = with_logger(NullLogger()) do
percival(nlp, subsolver_logger = NullLogger())
end
cp = deepcopy(output)
sol = cp.solution
println("OUTPUT $(ii + 1) : ", output)
end
Thanks @amontoison for your response. It's working for me now. I still have some further work to do on the algorithm: I want to modify it to take a strong approximate stationarity condition into account. I have installed the package, but I do not have a local version to work with. I want to get to the files of the outer iteration (percival.jl) and the inner iteration (tron.jl), modify them, and test the overall method again. Can you help me get to the code of the algorithm, implement the modification, and test it? Thanks again for your interest in my work.
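For reference, a sketch of the usual Julia workflow to get editable local copies of the solvers (assuming the registered package names Percival and JSOSolvers):
using Pkg
Pkg.develop("Percival")    # clones the sources into ~/.julia/dev/Percival; percival.jl is under src/
Pkg.develop("JSOSolvers")  # clones into ~/.julia/dev/JSOSolvers; tron.jl is under src/
Edits under ~/.julia/dev/ take effect on the next using; Revise.jl can pick them up in a running session.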
@TashTatut Please use our Slack channels for questions. Thank you.
Closing as the original scope is done.