Actor update equation in DDPG
Closed this issue · 2 comments
Deleted user commented
Hi,
When you use the gradient of the critic to update the actor here, why do you pass "-action_gdts" instead of "action_gdts" as the third parameter of tf.gradients()? Where does the minus sign come from?
I double-checked the formula and I still don't see why this is the case in your code.
Thanks!
germain-hug commented
Apologies for the late response. The negation here is a simple trick to compute the actor's gradients with a standard optimizer: tf.gradients(ys, xs, grad_ys) returns grad_ys · ∂ys/∂xs, so passing -action_gdts yields -∂Q/∂a · ∂μ/∂θ. Since TensorFlow optimizers minimize, applying that negated gradient performs gradient ascent on Q, which is exactly the deterministic policy gradient update. This is similar to OpenAI's baselines.
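To see why the minus sign gives ascent, here is a minimal NumPy-only sketch (hypothetical toy problem, not the repo's code): a linear actor a = w·s and a concave critic Q(s, a) = -(a - 2)², maximized at a = 2. Negating the critic's action gradient before handing it to a minimizer-style update (w -= lr · grad) drives the action toward the critic's maximum:

```python
import numpy as np

s = 1.0   # fixed state
w = 0.0   # actor parameter
lr = 0.1

def dQ_da(a):
    # gradient of the toy critic Q(s, a) = -(a - 2)^2 w.r.t. the action
    return -2.0 * (a - 2.0)

for _ in range(100):
    a = w * s
    action_gdts = dQ_da(a)               # analogous to tf.gradients(Q, action)
    # Analogue of tf.gradients(actor_out, weights, -action_gdts):
    # chain rule with the *negated* action gradient.
    grad_for_minimizer = s * (-action_gdts)
    # A minimizer subtracts the gradient, so the net effect is ascent on Q.
    w -= lr * grad_for_minimizer

print(round(w * s, 3))  # → 2.0, the action that maximizes the critic
```

Without the negation, the same minimizer-style update would descend on Q and push the action away from the maximum.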
Deleted user commented
Thanks!