A long-horizon, sparse-reward math environment for reinforcement learning. Official code repo for "What makes Math problems hard for reinforcement learning: A case study".
Primary language: Jupyter Notebook. License: MIT.