Question: Different results for Boxconstraints with torch.clamp vs cvxpy layers

philippkiesling opened this issue 2 years ago · 0 comments

Hi everybody,

I am training a reinforcement learning agent with a one-dimensional continuous action space. I want to restrict the action space so that the agent can only choose valid actions. The valid range changes depending on the current state.

Therefore I tried two approaches to limiting the action space:

1. `torch.clamp`, which computes $y_i = \min(\max(x_i, \text{min\_value}), \text{max\_value})$
   (https://pytorch.org/docs/stable/generated/torch.clamp.html)
2. A cvxpy layer that solves the box-constrained problem.
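For reference, here is a minimal sketch of how the `torch.clamp` variant could look with state-dependent bounds. The tensor values and the helper name `clamp_action` are made up for illustration and are not taken from my actual training code.

```python
import torch

def clamp_action(raw_action, low, high):
    """Restrict raw_action elementwise to [low, high] using torch.clamp."""
    return torch.clamp(raw_action, min=low, max=high)

# Hypothetical batch: one raw action per state, with per-state bounds.
raw = torch.tensor([-2.0, 0.3, 1.5], requires_grad=True)
low = torch.tensor([-1.0, -1.0, 0.0])
high = torch.tensor([1.0, 1.0, 1.0])

clamped = clamp_action(raw, low, high)
clamped.sum().backward()  # gradient of each output w.r.t. its own input

forward_values = clamped.detach().tolist()
grad_values = raw.grad.tolist()
print(forward_values, grad_values)
```

In this sketch the gradient with respect to the raw action is 1 where the action lies strictly inside the bounds and 0 where it was clipped.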
The forward pass and the gradient of the two approaches are the same, which I double-checked by plotting both (see the plots at the end).

To my surprise, however, they yield completely different training results: the `torch.clamp` version does not converge to a good solution, while the cvxpylayer version does.
Is there any explanation for this?

All the best,
Philipp

[Plot: the gradient of both functions]

[Plot: the forward pass of both functions]