sholtodouglas/learning_from_play

Change actions to be relative & in end-effector space (not joint space)


We hypothesise that relative actions will be easier to learn, since the model does not have to learn or account for the DC component of the signal. In the literature it's often observed that normalisation and rescaling of inputs greatly helps with training. From some early experiments, relative actions appear noticeably quicker to train to an equivalent loss and seem to perform better on validation data.
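As a minimal sketch of what "relative" means here (array names and values are illustrative, not from the repo): absolute targets carry a constant offset, while per-step deltas are roughly zero-centred, which is the DC-component point above.

```python
import numpy as np

# Hypothetical trajectory of absolute end-effector positions, shape (T, 3).
positions = np.array([[0.10, 0.20, 0.30],
                      [0.12, 0.21, 0.30],
                      [0.15, 0.21, 0.29]])

# Relative actions: per-step deltas. The constant offset ("DC component")
# of the absolute signal disappears, leaving small, roughly zero-mean targets.
relative_actions = np.diff(positions, axis=0)

# At execution time, absolute targets are recovered by integrating the
# deltas from the current observed position.
recovered = positions[0] + np.cumsum(relative_actions, axis=0)
assert np.allclose(recovered, positions[1:])
```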

Secondly, we want to combine this with learning actions in Cartesian end-effector space (rather than the current robot joint space), as Sholto reckons this gives smoother actions and is less prone to observation noise.
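To make the end-effector-space idea concrete, here is a toy planar 2-link arm (link lengths and path are made up, not the simulated robot in this repo): actions are commanded as Cartesian targets and mapped to joint angles via analytic inverse kinematics, so a straight, evenly spaced task-space path can correspond to unevenly spaced joint-space motion.

```python
import numpy as np

# Hypothetical planar 2-link arm, used only to illustrate the mapping
# from Cartesian end-effector targets to joint angles.
L1, L2 = 0.5, 0.5

def forward_kinematics(q1, q2):
    """End-effector (x, y) position for joint angles q1, q2."""
    x = L1 * np.cos(q1) + L2 * np.cos(q1 + q2)
    y = L1 * np.sin(q1) + L2 * np.sin(q1 + q2)
    return np.array([x, y])

def inverse_kinematics(x, y):
    """Analytic elbow-down IK; assumes the target is reachable."""
    c2 = (x**2 + y**2 - L1**2 - L2**2) / (2 * L1 * L2)
    q2 = np.arccos(np.clip(c2, -1.0, 1.0))
    q1 = np.arctan2(y, x) - np.arctan2(L2 * np.sin(q2), L1 + L2 * np.cos(q2))
    return q1, q2

# A straight Cartesian path with uniform steps; the corresponding
# joint-space steps are generally non-uniform.
path = np.linspace([0.6, 0.1], [0.6, 0.4], 5)
joint_path = [inverse_kinematics(x, y) for x, y in path]
```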

As a side note, relative quaternions seem to be computable via the following:
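One way to compute them (a sketch, not necessarily the repo's implementation; uses the Hamilton product with quaternions in (w, x, y, z) order, and assumes unit quaternions so the inverse equals the conjugate):

```python
import numpy as np

def quat_mul(q1, q2):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def quat_conjugate(q):
    w, x, y, z = q
    return np.array([w, -x, -y, -z])

def relative_quat(q_prev, q_next):
    """Rotation taking q_prev to q_next: q_rel = q_prev^{-1} * q_next."""
    return quat_mul(quat_conjugate(q_prev), q_next)
```

The absolute orientation is then recoverable as `q_next = q_prev * q_rel`, mirroring how relative positions are integrated back into absolute targets.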