
A distributed and efficient implementation of MuZero using PyTorch and Ray.

Primary language: Python

MuZero

by Edan Meyer

TODO

  • Batch training forward passes
  • Add Tensorboard metrics
  • Scale rewards
  • Add reward supports
  • Implement Reanalyze
  • Add prioritized sampling to the replay buffer
  • Retry loss scaling and a higher learning rate (currently blocked by unstable rewards)
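
The "Scale rewards" and "Add reward supports" items above could be sketched roughly as follows. This is only an illustration based on the invertible value transform and categorical ("two-hot") support encoding described in the MuZero paper's appendix, not this repository's code. The function names `scale`, `unscale`, `to_support`, and `from_support` are hypothetical, and the sketch uses plain Python floats rather than PyTorch tensors to stay self-contained.

```python
import math

EPS = 0.001  # epsilon used in the MuZero paper's transform


def scale(x: float, eps: float = EPS) -> float:
    """h(x) = sign(x) * (sqrt(|x| + 1) - 1) + eps * x"""
    return math.copysign(math.sqrt(abs(x) + 1.0) - 1.0, x) + eps * x


def unscale(y: float, eps: float = EPS) -> float:
    """Closed-form inverse of scale()."""
    root = (math.sqrt(1.0 + 4.0 * eps * (abs(y) + 1.0 + eps)) - 1.0) / (2.0 * eps)
    return math.copysign(root * root - 1.0, y)


def to_support(x: float, support_size: int) -> list:
    """Encode a scalar as weights over the integers -support_size..support_size.

    The mass is split between the two nearest integers so that the
    expectation of the distribution equals the (clamped) scalar.
    """
    x = max(-support_size, min(support_size, x))
    low = math.floor(x)
    frac = x - low
    probs = [0.0] * (2 * support_size + 1)
    probs[low + support_size] = 1.0 - frac
    if frac > 0.0:
        probs[low + support_size + 1] = frac
    return probs


def from_support(probs: list, support_size: int) -> float:
    """Decode a distribution over the support back to its expected value."""
    return sum(p * (i - support_size) for i, p in enumerate(probs))
```

In this sketch a raw reward would be passed through `scale` and then `to_support` to build a training target, and a predicted distribution would be decoded with `from_support` followed by `unscale`. Compressing large rewards this way is one possible route to the stability that the last TODO item is waiting on, though whether it suffices here is untested.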