Add additional rewards to the env info field
Closed this issue · 0 comments
rickstaa commented
To use the LSAC algorithm, I need to supply it with the following additional fields:
- rewards
- l_rewards
- violation_of_constraint: Whether the constraints were violated.
L_rewards
l_rewards = 20 * max(abs(x) - self.cons_pos, 0) ** 2 / 100  # + 20 * (max(abs(theta) - 0.8 * self.theta_threshold_radians, 0) / self.theta_threshold_radians) ** 2
violation_of_constraint
if abs(x) > self.cons_pos:
    violation_of_constraint = 1
else:
    violation_of_constraint = 0
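
As a rough sketch of how the two snippets above could be combined into one info dictionary: the helper name `constraint_info` and the example values for `x` and `cons_pos` are hypothetical and only for illustration, and the commented-out `theta` term is left out. In the actual environment these values would presumably be computed inside `step` and returned through the `info` field.

```python
def constraint_info(x, cons_pos):
    """Build the extra info fields from cart position x and position bound cons_pos.

    Hypothetical helper: mirrors the two snippets above, nothing more.
    """
    # Quadratic penalty on how far |x| exceeds the position constraint.
    l_rewards = 20 * max(abs(x) - cons_pos, 0) ** 2 / 100
    # 1 if the position constraint is violated, otherwise 0.
    violation_of_constraint = 1 if abs(x) > cons_pos else 0
    return {
        "l_rewards": l_rewards,
        "violation_of_constraint": violation_of_constraint,
    }


# Example values are assumptions, not taken from the environment.
info = constraint_info(x=5.0, cons_pos=4.0)
print(info)
```

Inside the constraint (`abs(x) <= cons_pos`) both fields come out as zero, so the algorithm only sees a penalty once the bound is exceeded.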