ejmejm/MuZero

A distributed and efficient implementation of MuZero using PyTorch and Ray.

MuZero

by Edan Meyer

TODO

Batch training forward passes
Add Tensorboard metrics
Scale rewards
Add categorical supports for reward prediction
Implement Reanalyze
Add prioritized sampling to the replay buffer
Retry loss scaling and a higher learning rate (currently blocked by unstable reward)
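
For the reward scaling and reward support items, one possible starting point is the invertible transform described in the MuZero paper, h(x) = sign(x)(sqrt(|x| + 1) - 1) + eps * x, followed by a two-bin projection onto an integer support. The sketch below is plain Python on single scalars and is not code from this repository; the function names (`scale`, `unscale`, `to_support`, `from_support`), the `support_size` parameter, and the choice of `EPS = 0.001` are assumptions made for illustration. A real implementation would likely operate on batched PyTorch tensors instead.

```python
import math
from typing import List

EPS = 0.001  # assumed coefficient of the small linear term in the transform


def scale(x: float, eps: float = EPS) -> float:
    """Squash a reward: h(x) = sign(x) * (sqrt(|x| + 1) - 1) + eps * x."""
    return math.copysign(math.sqrt(abs(x) + 1.0) - 1.0, x) + eps * x


def unscale(y: float, eps: float = EPS) -> float:
    """Closed-form inverse of scale()."""
    root = (math.sqrt(1.0 + 4.0 * eps * (abs(y) + 1.0 + eps)) - 1.0) / (2.0 * eps)
    return math.copysign(root * root - 1.0, y)


def to_support(y: float, support_size: int) -> List[float]:
    """Spread a scaled scalar over the integer bins -support_size..support_size."""
    y = max(-support_size, min(support_size, y))  # clip to the support range
    low = math.floor(y)
    frac = y - low
    probs = [0.0] * (2 * support_size + 1)
    probs[low + support_size] += 1.0 - frac       # weight on the lower bin
    if frac > 0.0:
        probs[low + support_size + 1] += frac     # remaining weight on the upper bin
    return probs


def from_support(probs: List[float], support_size: int) -> float:
    """Expected value of a categorical distribution over the integer bins."""
    return sum(p * (i - support_size) for i, p in enumerate(probs))


if __name__ == "__main__":
    target = to_support(scale(10.0), support_size=5)  # hypothetical reward of 10
    recovered = unscale(from_support(target, support_size=5))
    print(target)
    print(recovered)
```

With this layout, the network would predict logits over the `2 * support_size + 1` bins and be trained with cross-entropy against `to_support(scale(reward), support_size)`. At inference time the prediction would be decoded with `unscale(from_support(...))`.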