rickstaa/stable-learning-control

Add additional rewards to the env info field

Closed this issue · 0 comments

To be able to use the LSAC algorithm, I need the environment to supply the following additional fields through the `info` dictionary (a sketch of the resulting dict follows the list):

  • rewards: the regular environment reward.
  • l_rewards: the Lyapunov constraint cost (see below).
  • violation_of_constraint: whether the constraints were violated.
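
For illustration, the `info` dictionary returned from `step()` would then look roughly like this (the values shown are placeholders, not output from the actual code):

    info = {
        "rewards": 1.0,                # regular environment reward (placeholder)
        "l_rewards": 0.05,             # Lyapunov constraint cost (see below)
        "violation_of_constraint": 0,  # 1 if the constraint was violated, 0 otherwise
    }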

l_rewards

See https://github.com/hithmh/Actor-critic-with-stability-guarantee/blob/734d49c70c8f732e33617924446df2a656b8608b/ENV/env/classic_control/ENV_V0.py#L156

    l_rewards = 20 * max(abs(x) - self.cons_pos, 0) ** 2 / 100
    # Optional angle term, commented out in the reference implementation:
    # l_rewards += 20 * (max(abs(theta) - 0.8 * self.theta_threshold_radians, 0) / self.theta_threshold_radians) ** 2
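
For example, with `self.cons_pos = 1.0` and a cart position `x = 1.5` (illustrative values), the active term evaluates to `20 * 0.5 ** 2 / 100 = 0.05`. The cost grows quadratically with how far the constraint position is exceeded and is zero while `abs(x) <= self.cons_pos`.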

violation_of_constraint

        # 1 if the cart position exceeds the constraint position, 0 otherwise.
        if abs(x) > self.cons_pos:
            violation_of_constraint = 1
        else:
            violation_of_constraint = 0
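
Below is a minimal sketch of how all three fields could be exposed through a wrapper, assuming the classic 4-tuple Gym `step()` API and a CartPole-style environment that exposes a `cons_pos` attribute like the reference implementation above. The wrapper name `LSACInfoWrapper` is illustrative, not part of the existing codebase:

    import gym


    class LSACInfoWrapper(gym.Wrapper):
        """Illustrative wrapper that adds the extra LSAC fields to the info dict.

        Assumes the wrapped env exposes `cons_pos` (the constraint position).
        """

        def step(self, action):
            obs, reward, done, info = self.env.step(action)
            x = obs[0]  # cart position in the CartPole observation

            # Lyapunov constraint cost, as in the reference implementation above.
            l_reward = 20 * max(abs(x) - self.env.cons_pos, 0) ** 2 / 100

            info["rewards"] = reward
            info["l_rewards"] = l_reward
            info["violation_of_constraint"] = int(abs(x) > self.env.cons_pos)
            return obs, reward, done, info

With this in place, wrapping the environment (`env = LSACInfoWrapper(base_env)`) makes the extra fields available on every step without touching the algorithm code.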