subaochen/subaochen.github.io

Mathematical proof of policy improvement

Opened this issue 6 years ago · 0 comments

subaochen commented 6 years ago

https://subaochen.github.io/reinforcement%20learning/2019/08/21/policy-improvement-math-prove/