germain-hug/Deep-RL-Keras

DDPG

Closed this issue · 11 comments

Hey, great work on the implementations! I tried using your DDPG implementation on another environment (BeerGame) and am getting the following error:

Traceback (most recent call last):
  File "ddpg.py", line 191, in <module>
    main()
  File "ddpg.py", line 183, in main
    stats = distributor.train(env, args, summary_writer)
  File "ddpg.py", line 130, in train
    self.update_models(states, actions, critic_target)
  File "ddpg.py", line 73, in update_models
    self.actor.train(states, actions, np.array(grads).reshape((-1, self.act_dim)))
  File "/Users/aravind/Desktop/DDPG/ddpg_actor.py", line 72, in train
    self.adam_optimizer([states, grads])
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2666, in __call__
    return self._call(inputs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2635, in _call
    session)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2587, in _make_callable
    callable_fn = session._make_callable_from_options(callable_opts)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1414, in _make_callable_from_options
    return BaseSession._Callable(self, callable_options)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1368, in __init__
    session._session, options_ptr, status)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.OutOfRangeError: Node 'Adam' (type: 'NoOp', num of outputs: 0) does not have output 0
Exception ignored in: <bound method BaseSession._Callable.__del__ of <tensorflow.python.client.session.BaseSession._Callable object at 0x119e7f630>>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1398, in __del__
    self._session._session, self._handle, status)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: No such callable handle: 140324910323680

I only made a few modifications to your code to suit my environment, such as the state and action dimensions. The error occurs during training, in the Adam optimizer, when the action gradients from the critic are being propagated to the actor network. I was wondering if you encountered any similar errors during your implementation.

Hi, I don't recall seeing such an error when implementing DDPG. I am not familiar with BeerGame, although I would assume the error comes from the [states, grads] you feed into the optimizer.
Also, have you checked that all the dimensions in optimizer() of actor.py are correct?

Thanks for the quick response! I thought along the same lines and I did check all shapes inside the optimizer function and its inputs. Everything seems to be fine:

action_gdts.shape: (?, 1)
params_grad.shape: (8,)
trainable_weights.shape: (8,)
grads: <zip object at 0x1177d9d48>, shape: ()
state shape: (64, 6), action shape: (64, 1), grads shape: (64, 1)
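For reference, the optimizer() in actor.py that produces these tensors looks roughly like this (paraphrased from the repo, so treat it as a sketch rather than the exact code; it assumes import tensorflow as tf and import keras.backend as K at the top of the file):

def optimizer(self):
    # Placeholder for the action gradients dQ/da coming from the critic
    action_gdts = K.placeholder(shape=(None, self.act_dim))
    # Chain rule: gradient of the actor output w.r.t. its weights, weighted by -dQ/da
    params_grad = tf.gradients(self.model.output, self.model.trainable_weights, -action_gdts)
    grads = zip(params_grad, self.model.trainable_weights)
    return K.function([self.model.input, action_gdts],
                      [tf.train.AdamOptimizer(self.lr).apply_gradients(grads)])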

I just tried the 'MountainCarContinuous-v0' environment with no modifications to your code and it throws the same error. I found this env used as an example in your continuous environment wrapper, so I'm assuming you tried it?

Thank you for the feedback. I just tried running DDPG with MountainCarContinuous-v0 and it runs fine on my computer. Are you using Keras 2.1.6?

Fixed it. I am using Keras 2.2.2 and there was a minor change in the way K.function works after ~2.1.6.

In the optimizer function in actor.py, I had to change
return K.function([self.model.input, action_gdts], [tf.train.AdamOptimizer(self.lr).apply_gradients(grads)])
to
return K.function([self.model.input, action_gdts], [tf.train.AdamOptimizer(self.lr).apply_gradients(grads)][1:])

Including the training op in the outputs of K.function threw the error; slicing it out of the outputs list gets rid of the error and it runs like a charm. Thanks a lot for helping out!
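For anyone hitting this later: the traceback actually points at the cause. apply_gradients() returns a NoOp with zero output tensors ('num of outputs: 0'), and Keras around 2.2 and above tries to fetch output 0 of every entry in the outputs list. A rough illustration, using the same names as above:

train_op = tf.train.AdamOptimizer(self.lr).apply_gradients(grads)  # an Operation, no output tensors
K.function([self.model.input, action_gdts], [train_op])      # newer Keras fetches train_op's output 0 -> OutOfRangeError
K.function([self.model.input, action_gdts], [train_op][1:])  # [train_op][1:] == [], so no error

Note that the second variant silences the error by not passing the op at all; see the comments below about the model not updating.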

Great! Glad to hear that was the error

So this is not actually correct either. In Keras 2.2.2 and above, K.function is a bit different:

K.function(inputs, outputs, updates)

This is the correct function:
K.function(inputs=[self._state, self._action_grads], outputs=[], updates=[tf.train.AdamOptimizer(self._learning_rate).apply_gradients(grads)])

I was initially caught out by this: I couldn't work out why the model wasn't updating, and this is why.
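Put back into the repo's own variable names (self._state and self._action_grads in my snippet are just renames of self.model.input and the action-gradient placeholder), the whole fixed optimizer() would look something like this sketch:

def optimizer(self):
    action_gdts = K.placeholder(shape=(None, self.act_dim))
    params_grad = tf.gradients(self.model.output, self.model.trainable_weights, -action_gdts)
    grads = zip(params_grad, self.model.trainable_weights)
    # The train op goes in updates=, where Keras runs it without trying to fetch an output tensor
    return K.function(inputs=[self.model.input, action_gdts],
                      outputs=[],
                      updates=[tf.train.AdamOptimizer(self.lr).apply_gradients(grads)])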

@sverzijl:

K.function(inputs=[self._state, self._action_grads], outputs=[], updates=[tf.train.AdamOptimizer(self._learning_rate).apply_gradients(grads)])

I am having similar issues. What is self._state? I didn't find it in actor.py.
The current documentation for K.function says the inputs should be placeholder tensors.

I got it: I was missing the slice at the end of apply_gradients. This is what worked for me:

K.function(inputs=[self.model.input, action_gdts], outputs=[],
           updates=[tf.train.AdamOptimizer(self.learning_rate).apply_gradients(grads)][1:])

Disclaimer: I am adapting the code for a different problem.
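Careful with that trailing [1:], though: it slices the op out of updates as well, leaving updates=[], which is the same silently-not-training trap described above. If outputs=[] works in your Keras version, the slice should go:

K.function(inputs=[self.model.input, action_gdts], outputs=[],
           updates=[tf.train.AdamOptimizer(self.learning_rate).apply_gradients(grads)])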

Funnily enough, none of these solutions worked for me. The problem in my case was that outputs could not be an empty list [] (which all of the snippets above use), so I ended up adding a dummy output and everything works.

K.function(inputs=[self.model.input, action_gdts], outputs=[K.constant(1)],
           updates=[tf.train.AdamOptimizer(self.lr).apply_gradients(grads)])

Hello everyone,
I am getting a similar error, but none of the solutions mentioned above works for me.
Does anybody have a recommendation?