The repository contains implementations of A3C, A2C, DDQN, and REINFORCE (naive) with TensorFlow 2.0. Some of them have been demonstrated in the OpenAI Cart Pole environment.
In addition, it modularizes the API of environments (Cart Pole, Flappy Bird, and a remote environment) and exploration strategies (still a work in progress). The remote environment even allows the agent to connect to an external server and interact with it.
Modularization is still in progress, but here is the DEMO on OpenAI Cart Pole. I use a master-slave strategy (similar to the parameter server strategy in TensorFlow 1), implemented with TensorFlow 2.0 and Python's multiprocessing module. Each worker sends its computed gradients to the master, and the master applies the incoming gradients to the global model. The master also keeps sending the latest model variables back to the workers.
However, TensorFlow 2.0 has removed tf.Session(), which could allocate a computation task to a specific device. Therefore, I use with tf.device() to specify the device for each task.
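The master-worker exchange described above can be sketched as follows. This is a minimal, self-contained illustration of the message pattern only: threads and plain-Python lists stand in for the repo's actual processes and TensorFlow tensors, the gradient computation is a placeholder, and the exchange here is round-based rather than fully asynchronous. All function names are illustrative.

```python
import queue
import threading

def worker(var_q, grad_q, steps):
    # Worker loop: receive the latest global variables, compute a
    # placeholder gradient from them, and send it back to the master.
    for _ in range(steps):
        variables = var_q.get()
        gradients = [2.0 * v for v in variables]  # stand-in for real policy gradients
        grad_q.put(gradients)

def train(n_workers=2, steps=3, lr=0.1):
    grad_q = queue.Queue()
    var_qs = [queue.Queue() for _ in range(n_workers)]
    variables = [1.0, -1.0]  # global model parameters
    workers = [threading.Thread(target=worker, args=(q, grad_q, steps))
               for q in var_qs]
    for w in workers:
        w.start()
    for _ in range(steps):
        for q in var_qs:                # master broadcasts the latest variables
            q.put(list(variables))
        for _ in range(n_workers):      # master applies each incoming gradient
            grads = grad_q.get()
            variables = [v - lr * g for v, g in zip(variables, grads)]
    for w in workers:
        w.join()
    return variables
```

In the real implementation the "apply gradients" step would be `optimizer.apply_gradients(...)` on the global model, wrapped in `with tf.device(...)` to pin it to the desired device.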
For more details, please read the doc: Tricks of A3C on TensorFlow2 + Multiprocessing
and here is the DEMO
Tensorflow DEMO on Cart Pole
Tensorflow DEMO on Flappy Bird
Implementation of Actor-Critic Network
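The core quantities an actor-critic network trains on are the discounted returns and the advantages. The sketch below is a plain-Python illustration of that computation (not taken from the repo's code); the actor is trained on `log pi(a_t|s_t) * A_t` and the critic regresses `V(s_t)` toward the return `G_t`.

```python
def discounted_returns(rewards, gamma=0.99, bootstrap=0.0):
    # Work backwards through the episode: G_t = r_t + gamma * G_{t+1}.
    # `bootstrap` is V(s_T) when the episode was cut off, else 0.
    returns, g = [], bootstrap
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

def advantages(rewards, values, gamma=0.99):
    # Advantage A_t = G_t - V(s_t): how much better the taken action
    # turned out than the critic's baseline estimate.
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, values)]
```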
Implementation of Double Deep Q-Network with Tensorflow
Working on it
Working on it
Integrate the API of different environments.
From OpenAI Cart Pole.
From PLE Flappy Bird.
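One way to integrate these environments behind a single API is a small interface that every backend (Cart Pole, Flappy Bird, remote server) implements. This is a sketch of the idea, assuming a Gym-like `reset`/`step` contract; the class names and the toy environment are illustrative, not the repo's actual classes.

```python
from abc import ABC, abstractmethod

class Environment(ABC):
    """Common interface wrapping Cart Pole, Flappy Bird, or a remote env."""

    @abstractmethod
    def reset(self):
        """Start a new episode and return the initial observation."""

    @abstractmethod
    def step(self, action):
        """Apply an action; return (observation, reward, done)."""

class CountdownEnv(Environment):
    # Toy backend for illustration: episode ends after n steps,
    # reward 1.0 per step, observation is the steps remaining.
    def __init__(self, n=3):
        self.n = n
        self.left = n

    def reset(self):
        self.left = self.n
        return self.left

    def step(self, action):
        self.left -= 1
        return self.left, 1.0, self.left == 0
```

An agent written against `Environment` can then switch between local and remote backends without changing its training loop.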
Create a TCP client and connect to the provided server. You can see the DEMO and the details in another repo (RL Java Integral), where we implement a Java multi-threaded server that interacts with the A3C model. Thanks to tom1236868 for implementing the Java server.
Reference: