andrewjong/Deep-Learning-Paper-Surveys

[PPO] Proximal Policy Optimization Algorithms (Month Year Conference)

Opened this issue 4 years ago · 1 comment

andrewjong commented 4 years ago

0. Article Information and Links

Paper's project website: https://openai.com/blog/openai-baselines-ppo/
Release date: YYYY/MM/DD
Number of citations (as of 2020/MM/DD):

1. What do the authors try to accomplish?

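Vanilla policy gradient methods are unstable: a single overly large update can collapse the policy.
TRPO is stable, but complicated to implement.
Goal: a method that is both stable and simple.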
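
To make the "stable and simple" goal concrete, here is a minimal sketch of the clipped surrogate objective that the PPO paper proposes in place of TRPO's constrained update. It is an illustration only, not the authors' implementation: the function name `ppo_clip_objective`, its argument names, and the plain-Python batch loop are assumptions made for this note, and `clip_eps=0.2` is just a commonly used setting.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical argument names).
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio r = pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # The ratio here is 2.0, but with a positive advantage the objective
    # only credits the clipped value, so there is no incentive to move
    # the policy further than the clip range in one update.
    print(ppo_clip_objective([math.log(2.0)], [0.0], [1.0]))
```

The clipping is what stands in for TRPO's trust region: once the ratio leaves the clip range in the direction the advantage favors, the objective stops improving, so an ordinary first-order optimizer can be used without a separate constraint solver.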

2. What's great compared to previous research?

3. Where are the key elements of the technology and method?

4. How do the authors measure success?

5. How did you verify that it works?

6. Things to discuss? (e.g. weaknesses, potential for future work, relation to other work)

7. Are there any papers to read next?

8. References

andrewjong commented 4 years ago

We covered this paper in this video