This is a DQN implementation in TensorFlow 2. Two versions are implemented: the original DQN (Nature 2015) and multi-step DQN. The experiments were run in the OpenAI Gym CartPole-v1 environment.
- Python 3.8.2
- tensorflow 2.2.0rc2
- matplotlib 3.2.1
To train only the original DQN:
> python script.py orgDQN
To train the original DQN and then the multi-step DQN, one after the other:
> python script.py orgDQN multistep
If the average step count exceeds 475, training ends early.
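The early-stopping rule could be sketched roughly as follows. This is an illustration only, not the code in this repository: the helper name `should_stop` and the 100-episode averaging window are assumptions, since the window size is not stated here.

```python
from collections import deque

THRESHOLD = 475.0  # average step count from the rule above
WINDOW = 100       # assumed window size, not stated in this README


def should_stop(recent_steps, threshold=THRESHOLD):
    """Return True once the window is full and its mean exceeds the threshold."""
    # Wait for a full window so a few lucky early episodes do not end training.
    if len(recent_steps) < recent_steps.maxlen:
        return False
    return sum(recent_steps) / len(recent_steps) > threshold


# Stand-in for real episode lengths collected during training.
recent_steps = deque(maxlen=WINDOW)
for episode_steps in [500] * WINDOW:
    recent_steps.append(episode_steps)

print(should_stop(recent_steps))
```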
Model | End episode | Final average step count |
---|---|---|
Original DQN | 488 | 475.39 |
Multi-step DQN (n=3) | 303 | 476.4 |
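For reference, the multi-step variant (n=3 in the table above) replaces the one-step TD target with an n-step return: the discounted sum of the next n rewards plus a bootstrapped value from the target network. A minimal sketch of that target follows. The function name `n_step_target`, the argument names, and the default `gamma` are hypothetical, and the code in this repository may compute the target differently.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99, done=False):
    """Compute r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_value.

    rewards:         the n rewards observed after the sampled state
    bootstrap_value: max over actions of the target network's Q-value at the
                     state reached after n steps (assumed precomputed)
    done:            True if the episode ended within these n steps, in which
                     case the bootstrap term is dropped
    """
    target = 0.0 if done else bootstrap_value
    # Fold the rewards in from last to first; each step applies one discount.
    for r in reversed(rewards):
        target = r + gamma * target
    return target


# Three rewards of 1.0 (CartPole gives +1 per step), so n = 3.
print(n_step_target([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5))
```

With n=1 this reduces to the one-step target used by the original DQN.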