World Models A3C
Implementation of a variant of World Models
Note
- Replaced MDN-RNN with LSTM for Memory
- Replaced CMA-ES with A3C for Controller
- Trained over two stages
- Stage 1: V and M were trained on a dataset collected with random rollouts
- Stage 2: V and M were trained on a dataset collected with A3C rollouts
Training Result
Result with the dataset from random rollouts
Result with the dataset from rollouts of the pretrained model
Play Demo
Environment Setting
apt-get update
apt-get install swig
pip install 'gym[box2d]'
Training Stage I
Dataset generation using rollouts with a random policy
python rollout.py
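The internals of rollout.py are not shown here, so the following is only a minimal sketch of what collecting one episode with a random policy can look like. `ToyEnv` is a hypothetical stand-in for a gym environment with the classic `reset`/`step` interface, and the 3-value action (steer, gas, brake) is an assumption, not something taken from the script.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a gym environment (classic reset/step API)."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0, 0.0]

    def step(self, action):
        self.t += 1
        obs = [float(self.t), action[0]]
        reward = -0.1
        done = self.t >= self.horizon
        return obs, reward, done, {}


def random_action(rng):
    # Assumed action layout: steer in [-1, 1], gas in [0, 1], brake in [0, 1].
    return [rng.uniform(-1.0, 1.0), rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)]


def collect_rollout(env, rng, max_steps=1000):
    """Run one episode with random actions and record every transition."""
    episode = {"obs": [], "actions": [], "rewards": [], "dones": []}
    obs = env.reset()
    for _ in range(max_steps):
        action = random_action(rng)
        next_obs, reward, done, _ = env.step(action)
        episode["obs"].append(obs)
        episode["actions"].append(action)
        episode["rewards"].append(reward)
        episode["dones"].append(done)
        obs = next_obs
        if done:
            break
    return episode


episode = collect_rollout(ToyEnv(horizon=5), random.Random(0))
print(len(episode["obs"]))  # prints 5
```

Storing observations together with actions matters because both V (observations only) and M (observation and action sequences) are trained from the same dataset.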
Vision model with VAE
python train-vae.py
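The loss used in train-vae.py is not shown here. As a point of reference, and assuming the standard VAE objective with a diagonal Gaussian encoder and a standard normal prior, the quantity minimized per observation $x$ would be the following. The symbols $\phi$, $\theta$, $\mu_j$, $\sigma_j$ and the latent size $d$ are notation introduced for this sketch.

```latex
\mathcal{L}(x) =
  \mathbb{E}_{q_\phi(z \mid x)}\bigl[-\log p_\theta(x \mid z)\bigr]
  + D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, \mathcal{N}(0, I)\bigr),
\qquad
D_{\mathrm{KL}} = -\frac{1}{2} \sum_{j=1}^{d}
  \bigl(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\bigr)
```

The first term is the reconstruction error and the second keeps the latent code $z$ close to the prior.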
Memory model with LSTM-RNN
python train-rnn.py
Controller with A3C
python train-a3c.py
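For readers new to A3C, the core quantity each worker computes is the n-step return and the advantage derived from it. The sketch below shows that textbook computation only. The names `n_step_returns`, `advantages`, `bootstrap_value` and the default `gamma` are assumptions, and train-a3c.py may organize this differently.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns for one rollout segment, computed backwards.

    bootstrap_value is the critic's estimate for the state after the
    last reward (0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage = n-step return minus the critic's value estimate."""
    return [g - v for g, v in zip(returns, values)]


returns = n_step_returns([1.0, 1.0, 1.0], bootstrap_value=0.0, gamma=0.5)
print(returns)                                # prints [1.75, 1.5, 1.0]
print(advantages(returns, [1.0, 1.0, 1.0]))   # prints [0.75, 0.5, 0.0]
```

The advantage weights the policy-gradient term, while the squared difference between return and value estimate typically serves as the critic loss.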
Training Stage II
Rollout with the pretrained model
python rollout-a3c.py
Fine-tune V and M with the new dataset
vi hparams.py  # set extra = True, then save
python train-vae.py
python train-rnn.py
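The manual edit of hparams.py in this step could also be scripted. The helper below is a hypothetical sketch that assumes the flag is a plain `extra = ...` assignment on its own line. It rewrites only the first such line and drops anything after the `=` on that line, including a trailing comment.

```python
import re


def set_flag(source, name, value):
    """Return source with the first `name = ...` line set to repr(value)."""
    pattern = rf"^([ \t]*{re.escape(name)}[ \t]*=[ \t]*).*$"
    updated, count = re.subn(
        pattern,
        lambda match: match.group(1) + repr(value),
        source,
        count=1,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError(f"no assignment to {name!r} found")
    return updated


example = "lr = 0.001\nextra = False\n"   # made-up file content for illustration
print(set_flag(example, "extra", True), end="")
```

Applied to the real file, this would mean reading hparams.py, passing its text through `set_flag`, and writing the result back before rerunning the training scripts.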
Train a new C with the improved V and M
python train-a3c.py
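The two stages above run the same training scripts in a fixed order, which can be summarized as a small schedule. This is a sketch, not part of the repository: `build_schedule` and `run_schedule` are hypothetical helpers, and only the script names come from this README. The sketch does not automate the hparams.py edit (`extra = True`) that this README performs after rollout-a3c.py, so it defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

STAGE_1 = ["rollout.py", "train-vae.py", "train-rnn.py", "train-a3c.py"]
STAGE_2 = ["rollout-a3c.py", "train-vae.py", "train-rnn.py", "train-a3c.py"]


def build_schedule():
    """Return (stage, script) pairs in the order listed in this README."""
    return [(1, script) for script in STAGE_1] + [(2, script) for script in STAGE_2]


def run_schedule(schedule, python=sys.executable, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for stage, script in schedule:
        command = [python, script]
        print(f"[stage {stage}] {' '.join(command)}")
        if not dry_run:
            # Stops at the first failing script instead of training on bad data.
            subprocess.run(command, check=True)


run_schedule(build_schedule())
```

Each stage ends with train-a3c.py because the controller depends on the V and M weights produced just before it.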
Test
# Arguments: <number of plays> <seed> <is_record>
python test.py 2 999 False
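How test.py reads these three positional arguments is not shown here. The sketch below illustrates one way to parse them, with hypothetical helper names. The point it makes is that the third argument arrives as a string, and `bool("False")` is `True` in Python, so the text has to be compared explicitly.

```python
def parse_bool(text):
    """Convert a command-line word such as 'False' into a real bool."""
    lowered = text.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise ValueError(f"expected a boolean, got {text!r}")


def parse_args(argv):
    """argv holds the three words after the script name."""
    if len(argv) != 3:
        raise ValueError("expected: <number of plays> <seed> <is_record>")
    n_plays, seed, is_record = argv
    return int(n_plays), int(seed), parse_bool(is_record)


print(parse_args(["2", "999", "False"]))  # prints (2, 999, False)
```

With the command shown above, this yields two plays, seed 999, and recording disabled.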