A Proximal Policy Optimization (PPO) implementation in TensorFlow.
https://arxiv.org/pdf/1707.06347.pdf
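For orientation, the core of the linked paper is the clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The block below is a minimal plain-Python sketch of that objective only. It is not taken from this repository's TensorFlow code; the function name `clipped_surrogate` and the default `epsilon=0.2` are illustrative assumptions.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Batch mean of min(r * A, clip(r, 1 - epsilon, 1 + epsilon) * A).

    ratios:     new-policy / old-policy probability ratio per sample
    advantages: advantage estimate per sample
    epsilon:    clipping range (hypothetical default for illustration)
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to [1 - epsilon, 1 + epsilon].
        clipped_r = max(1.0 - epsilon, min(r, 1.0 + epsilon))
        # Taking the minimum keeps the pessimistic (lower) estimate.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


if __name__ == "__main__":
    # With a positive advantage, a ratio above 1 + epsilon gets clipped.
    print(clipped_surrogate([1.5, 1.0], [1.0, 1.0]))
```

In training, this quantity is maximized (or its negative is minimized) with respect to the policy parameters.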
- Python3
- tensorflow
- gym[atari]
- opencv-python
- git+https://github.com/imai-laboratory/lightsaber
$ python train.py --gpu {0 or -1} --render --final-steps 10000000
$ python play.py --gpu {0 or -1} --render --load {path of models}
This implementation is inspired by the following projects.