Author's PyTorch implementation of the ICLR 2023 paper "Behavior Proximal Policy Optimization" (BPPO).
Primary language: Python. License: MIT.