An unofficial implementation of MEME (Efficient Memory-based Exploration agent) from DeepMind
- Fix prioritized experience replay
- Fix burn-in functionality
- Fix the code for large burn-in and rollout hyperparameters
- Find bugs and test for correctness
- Bootstrapping with online network.
- Target computation with tolerance.
- Loss and priority normalization.
- Cross-mixture training.
- Normalizer-free torso network.
- Shared torso with combined loss.
- Robustifying behavior via policy distillation.
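
As a rough illustration of the "target computation with tolerance" item above, the sketch below shows one way to compute a tolerance-based trace coefficient for a single step. It follows one reading of the idea in the cited paper: the trace is kept when the value of the action taken is within a relative tolerance of the greedy value, and cut otherwise. The function name `soft_watkins_trace`, its parameters, and the default values for `lam` and `kappa` are hypothetical and chosen for illustration; they are not taken from this repository, and the paper should be consulted for the exact formulation.

```python
from typing import Sequence


def soft_watkins_trace(
    q_values: Sequence[float],
    action: int,
    lam: float = 0.95,    # illustrative default, an assumption
    kappa: float = 0.01,  # illustrative default, an assumption
) -> float:
    """Return a trace coefficient for one time step.

    The trace is `lam` when the Q-value of the action taken is at least
    the greedy Q-value minus a tolerance of `kappa * |Q(taken)|`, and
    0.0 otherwise. With `kappa = 0` this reduces to cutting the trace
    whenever the action taken is not greedy.
    """
    q_taken = q_values[action]
    q_max = max(q_values)
    within_tolerance = q_taken >= q_max - kappa * abs(q_taken)
    return lam if within_tolerance else 0.0


if __name__ == "__main__":
    q = [1.0, 0.995, 0.5]
    print(soft_watkins_trace(q, action=1))             # near-greedy action
    print(soft_watkins_trace(q, action=2))             # clearly non-greedy action
    print(soft_watkins_trace(q, action=1, kappa=0.0))  # no tolerance
```

The tolerance lets near-greedy actions keep their traces, so more of the multi-step return is used than with a strict greedy check.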
@article{kapturowski2022human,
title={Human-level Atari 200x faster},
author={Kapturowski, Steven and Campos, V{\'\i}ctor and Jiang, Ray and Raki{\'c}evi{\'c}, Nemanja and van Hasselt, Hado and Blundell, Charles and Badia, Adri{\`a} Puigdom{\`e}nech},
journal={arXiv preprint arXiv:2209.07550},
year={2022}
}