ltlhuuu/A2PR

Implementation of A2PR, a simple way to achieve SOTA in offline reinforcement learning with an adaptive advantage-guided policy regularization method, in PyTorch

Python · MIT

0 Issues
20 Stargazers
3 Watchers

Stargazers

BlankSHC
HenryZhang-git
Facebear-ljx (Beijing)
ZhengYinan-AIR
LANYIXING
TianciGao (Moscow)
MaXiaoTianGitHub
epiphany-cc
ZitengHe
william-Dic (China/UK)
LRMbbj (Beijing, People's Republic of China)
t6-thu (Beijing)
ZhangDaniel7171
zzzmmmmm111
atsby
kessmith
yongqianxiao
JACKROY4dd
lyn22333
meetyoulastmonth
