Question: Different results for box constraints with torch.clamp vs cvxpy layers
philippkiesling opened this issue · 0 comments
Hi everybody,
I am training a reinforcement learning agent with a one-dimensional continuous action space. I want to restrict the action space so that the agent can only choose valid actions (the constraint changes depending on the current state).
Therefore I tried two approaches to limiting the action space:
- torch.clamp, which computes:
$y_i = \min(\max(x_i, \text{min\_value}), \text{max\_value})$
https://pytorch.org/docs/stable/generated/torch.clamp.html
- A cvxpy layer, solving the box constraint.
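For reference, the torch.clamp variant can be sketched as below. This is only a minimal illustration, not the code from my training setup: the function name `clamp_action`, the example values, and the per-sample bound tensors are assumptions made up for this sketch. It shows that the clamp passes a gradient of 1 when the raw action lies strictly inside the box and 0 when it lies strictly outside.

```python
import torch


def clamp_action(raw_action, low, high):
    """Box-constrain an action with torch.clamp.

    `low` and `high` may be tensors, so the bounds can differ per sample
    (for example, depend on the current state).
    """
    return torch.clamp(raw_action, min=low, max=high)


# Hypothetical raw actions and state-dependent bounds (illustrative values only).
raw = torch.tensor([-2.0, 0.3, 1.7], requires_grad=True)
low = torch.tensor([-1.0, 0.0, 0.0])
high = torch.tensor([1.0, 1.0, 1.0])

action = clamp_action(raw, low, high)
action.sum().backward()

print(action.detach())  # clamped actions: -1.0, 0.3, 1.0
print(raw.grad)         # gradient w.r.t. the raw actions: 0, 1, 0
```

Whenever the raw action is outside the box, no gradient flows back through the clamp for that sample.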
Forward pass and gradient are the same, which I double-checked by plotting the gradients, too.
To my surprise, however, they yield completely different results: The torch.clamp version does not converge to a good solution, while the cvxpylayer does.
Is there any explanation for this?
All the best,
Philipp
Figure: The gradient of both functions
Figure: Forward pass of both functions