This repository contains several policy gradient implementations (as well as an RQN) of the Puck_Catching_Agent designed in physics_learning_rl. It is intended for experimenting with, evaluating, and optimizing the different algorithms.
All implementations use the same simulator as physics_learning_rl.
-
RQN (Recurrent Q-Network) with LSTM
This is the PyTorch version of the RQN Puck_Catching_Agent (the pretrained model in the transfer learning framework) used in physics_learning_rl.
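
As a rough illustration of what a recurrent Q-network looks like in PyTorch, the sketch below feeds a sequence of observations through an LSTM and maps each hidden state to one Q-value per action. The class name `RecurrentQNetwork`, the layer sizes, and the observation and action dimensions are assumptions made for this example; they are not taken from this repository or from physics_learning_rl.

```python
import torch
import torch.nn as nn


class RecurrentQNetwork(nn.Module):
    """Hypothetical RQN: an LSTM over observations plus a linear Q-value head."""

    def __init__(self, obs_dim, hidden_dim, n_actions):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq, hidden=None):
        # obs_seq has shape (batch, time, obs_dim).
        # `hidden` is the LSTM state (h, c); passing it back in on the next
        # call lets the agent keep its memory across steps of an episode.
        features, hidden = self.lstm(obs_seq, hidden)
        q_values = self.q_head(features)  # (batch, time, n_actions)
        return q_values, hidden


torch.manual_seed(0)
# Assumed sizes, chosen only for the example.
net = RecurrentQNetwork(obs_dim=4, hidden_dim=16, n_actions=3)
obs = torch.zeros(1, 5, 4)  # one episode fragment of 5 steps
q_values, hidden = net(obs)

# Greedy action for the most recent step in the sequence.
greedy_action = q_values[0, -1].argmax().item()
```

During a rollout, the returned `hidden` state would typically be passed back into the next call and reset at the start of each episode.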
-
Actor-Critic
The actor and critic networks, both with LSTM layers, are updated at every step. The critic predicts the action value.
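
To make the per-step update concrete, here is a minimal sketch of the two losses for a single transition when the critic estimates the action value Q(s, a). It uses plain Python numbers instead of tensors, and it assumes a SARSA-style one-step target, which is one common choice and not necessarily the rule used in this repository. The function name and arguments are hypothetical.

```python
import math


def actor_critic_step_losses(log_prob, q_sa, reward, q_next, done, gamma=0.99):
    """Losses for one transition with an action-value critic (illustrative).

    log_prob: log pi(a|s) of the action that was taken
    q_sa:     critic estimate Q(s, a)
    q_next:   critic estimate Q(s', a') for the next state and action
    """
    # One-step target: bootstrap from the next action value unless the
    # episode has ended.
    target = reward + (0.0 if done else gamma * q_next)
    td_error = target - q_sa
    critic_loss = td_error ** 2
    # Policy gradient term: raise the log-probability of actions the critic
    # rates highly. With tensors, q_sa would be detached here so that the
    # actor loss does not update the critic.
    actor_loss = -log_prob * q_sa
    return actor_loss, critic_loss


actor_loss, critic_loss = actor_critic_step_losses(
    log_prob=math.log(0.5), q_sa=1.0, reward=1.0, q_next=2.0,
    done=False, gamma=0.5,
)
```

With these inputs the target is 1.0 + 0.5 * 2.0 = 2.0, so the squared TD error is 1.0 and the actor loss is log 2.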
-
Advantage Actor-Critic (A2C)
The actor and critic networks, both with LSTM layers, are updated after a whole episode. The critic predicts the state value.
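
Because the update happens only after the episode ends, the full reward sequence is available, and the advantage of each step can be computed as the discounted return minus the critic's state-value estimate. The sketch below shows that calculation in plain Python. The function names and the use of Monte Carlo returns are assumptions for illustration; the repository may compute its advantages differently.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step of an episode."""
    returns = []
    running = 0.0
    # Walk backwards so each return reuses the one that follows it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, gamma=0.99):
    """Advantage A_t = G_t - V(s_t), with V(s_t) taken from the critic."""
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, values)]


episode_rewards = [0.0, 0.0, 1.0]   # e.g. reward only when the puck is caught
critic_values = [0.25, 0.0, 0.5]    # hypothetical critic outputs V(s_t)

returns = discounted_returns(episode_rewards, gamma=0.5)
adv = advantages(episode_rewards, critic_values, gamma=0.5)
# returns == [0.25, 0.5, 1.0]
# adv     == [0.0, 0.5, 0.5]
```

In an A2C update, these advantages would weight the log-probabilities in the actor loss, while the critic is regressed toward the returns.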
Multi-agent framework:
- Agents learn to catch the same puck.
- Agents work as a team, controlling other pucks to push a target puck to the goal.
Requirements:
- Python 3.7+
- PyTorch 1.2.0+
- pyduktape 0.0.6