/mastering-urlb

[ICML 2023] Pre-train world-model-based agents with different unsupervised strategies, selectively fine-tune the agent's components, and use planning (Dyna-MPC) during fine-tuning.

Primary language: Python. License: MIT.
