Course Presentation for STA4273 2021 Winter, University of Toronto

Primary language: Jupyter Notebook. License: Apache License 2.0.

generalized_advantage_estimation

Colab version of our Jupyter notebook: https://colab.research.google.com/drive/11LTZ7tVR_IW4siDoK6qxWGBwn3Hp0zl9?usp=sharing

Link to the Course Page: https://www.cs.toronto.edu/~cmaddis/courses/sta4273_w21/

Link to the presentation slides on the course page: https://www.cs.toronto.edu/~cmaddis/courses/sta4273_w21/studentwork/gae.pdf

If you have any questions, please contact zhibozhang@cs.toronto.edu

Acknowledgement: Thanks to Prof. Chris Maddison for the discussion and guidance.

References:

[1] Schulman, J., Moritz, P., Levine, S., Jordan, M., & Abbeel, P. (2015). High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438.

[2] https://colab.research.google.com/drive/1Wb_2zKgAqhI2tVK19Y1QC8AHImrzlcme?usp=sharing