StepNeverStop/RLs

about using gumbel_distribution to transform discrete space

Closed this issue · 8 comments

In the code you provided, the DDPG algorithm supports both continuous and discrete action spaces by using the Gumbel distribution. MADDPG is a DDPG-based extension; is it also suitable for discrete action spaces when the Gumbel distribution is used? When I apply Gumbel in MADDPG, I cannot obtain reasonable results. I am using TensorFlow 1.14 without the tensorflow_probability module. Could you give me some code examples of Gumbel in TF, or some instructions? Sorry to bother you.

I'm not sure whether Gumbel is suitable for MADDPG.
Actually, MADDPG in my repo does not work well for now, and I'll fix it next.
You can find something related to the Gumbel distribution here.
I'm not familiar with how to use Gumbel with TF1.x; maybe you could find related answers in others' repos.

Thanks for your reply. In the MADDPG code supplied by OpenAI (https://github.com/openai/maddpg/), they use Gumbel sampling in distributions.py as follows:
def sample(self):
    # Gumbel-max trick: add Gumbel(0, 1) noise, -log(-log(u)), to the logits and take the argmax.
    u = tf.random_uniform(tf.shape(self.logits))
    return U.argmax(self.logits - tf.log(-tf.log(u)), axis=1)
So maybe MADDPG is able to handle discrete action spaces...
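For readers who want to see what that sampling line does, here is a minimal sketch of the Gumbel-max trick in plain Python (standard library only, so it runs without TensorFlow). The function names `sample_gumbel`, `gumbel_max_sample`, `softmax`, and `empirical_frequencies` are hypothetical helpers for illustration, not part of OpenAI's code or of this repo. The sketch adds Gumbel(0, 1) noise to the logits, takes the argmax, and checks empirically that the resulting actions follow the softmax of the logits.

```python
import math
import random


def sample_gumbel(rng, eps=1e-20):
    """Draw one Gumbel(0, 1) sample as g = -log(-log(u)), u ~ Uniform(0, 1)."""
    u = rng.random()
    # eps guards against log(0); it is an assumption of this sketch.
    return -math.log(-math.log(u + eps) + eps)


def gumbel_max_sample(logits, rng):
    """Return argmax_i(logits[i] + g_i), i.e. a sample from Categorical(softmax(logits))."""
    perturbed = [l + sample_gumbel(rng) for l in logits]
    return max(range(len(logits)), key=lambda i: perturbed[i])


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def empirical_frequencies(logits, n_samples=20000, seed=0):
    """Estimate how often each action index is drawn by Gumbel-max sampling."""
    rng = random.Random(seed)
    counts = [0] * len(logits)
    for _ in range(n_samples):
        counts[gumbel_max_sample(logits, rng)] += 1
    return [c / n_samples for c in counts]


if __name__ == "__main__":
    logits = [1.0, 0.0, -1.0]
    print("softmax probabilities:", softmax(logits))
    print("Gumbel-max frequencies:", empirical_frequencies(logits))
```

The argmax itself has no useful gradient, so this form is suitable for drawing actions but not for backpropagating from a critic into the actor's logits.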

Yes, Gumbel-softmax is a good trick for passing gradients through discrete action choices.
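As a rough illustration of that trick, the sketch below replaces the argmax with a temperature-scaled softmax over the Gumbel-perturbed logits, which gives a "soft" one-hot vector. It is plain Python for portability, not the code used in this repo; `gumbel_softmax_sample` and `to_one_hot` are hypothetical names, and the `temperature` values are arbitrary. In TF 1.x the same steps would presumably be built from `tf.random_uniform`, `tf.log`, and `tf.nn.softmax`, with a straight-through variant written as `tf.stop_gradient(y_hard - y_soft) + y_soft`; treat that mapping as an assumption rather than a tested recipe.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng, eps=1e-20):
    """Relaxed one-hot sample: softmax((logits + g) / temperature), g ~ Gumbel(0, 1)."""
    noisy = []
    for l in logits:
        u = rng.random()
        g = -math.log(-math.log(u + eps) + eps)  # Gumbel(0, 1) noise
        noisy.append((l + g) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def to_one_hot(soft):
    """Discretize a soft sample to a hard one-hot vector (forward pass of straight-through)."""
    k = max(range(len(soft)), key=lambda i: soft[i])
    return [1.0 if i == k else 0.0 for i in range(len(soft))]


if __name__ == "__main__":
    logits = [1.0, 0.0, -1.0]
    # Same seed for both temperatures, so both calls see the same Gumbel noise.
    warm = gumbel_softmax_sample(logits, 1.0, random.Random(0))
    cold = gumbel_softmax_sample(logits, 0.01, random.Random(0))
    print("temperature 1.0 :", warm)
    print("temperature 0.01:", cold)
    print("hard action     :", to_one_hot(warm))
```

Lower temperatures push the sample toward a one-hot vector but make gradients noisier, so the temperature is a tuning choice; the soft vector is what lets a critic's gradient reach the actor's logits.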

Since you have found the solution, this issue will now be closed. Feel free to re-open it.

When you used Gumbel-softmax in DDPG to solve problems with discrete action spaces, did you get the desired outcome?

@tanxiangtj Yes, it works well with gym.

Hello! Are you in Shanghai? Your work is very good. Would you mind sharing your WeChat contact so that I can ask you some questions?

If you have any questions, you can contact me by email or in an issue.

OK, thank you! If you solve the application of MADDPG to discrete action spaces, please let me know how you did it.