Pinned Repositories
Q-Learning-LFA
Chen, Z., Zhang, S., Doan, T. T., Clarke, J. P., & Maguluri, S. T. (2019). Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning.
Average-Reward-TD-Q-Learning
Code for the numerical experiments in Zhang, Sheng, Zhe Zhang, and Siva Theja Maguluri. "Finite Sample Analysis of Average-Reward TD Learning and Q-Learning."
DouZero
[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | DouDizhu AI
szhangml.github.io
szhangml's Repositories
szhangml/Average-Reward-TD-Q-Learning
Code for the numerical experiments in Zhang, Sheng, Zhe Zhang, and Siva Theja Maguluri. "Finite Sample Analysis of Average-Reward TD Learning and Q-Learning."
szhangml/DouZero
[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | DouDizhu AI
szhangml/szhangml.github.io