/off-policy-continuous-control

Official PyTorch code for "Recurrent Off-policy Baselines for Memory-based Continuous Control" (DeepRL Workshop, NeurIPS 21)

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Recurrent Off-policy Baselines for Memory-based Continuous Control

This repo is the official codebase for the following paper:

@article{yang2021recurrent,
  title={Recurrent Off-policy Baselines for Memory-based Continuous Control},
  author={Yang, Zhihan and Nguyen, Hai},
  journal={Deep RL Workshop, NeurIPS 2021},
  year={2021}
}

Paper summary: We implement and benchmark recurrent versions of DDPG, TD3 and SAC that use the full history.
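To illustrate what "full history" implies for off-policy training: a recurrent agent is usually trained on entire episodes rather than on isolated transitions, so the replay buffer has to keep each episode intact. The sketch below is a minimal illustration under that assumption, not the buffer used in this repo. The names EpisodeReplayBuffer, push and sample are hypothetical, and padding or batching of variable-length episodes is left out.

```python
import random
from collections import deque


class EpisodeReplayBuffer:
    """Hypothetical buffer that stores and samples whole episodes."""

    def __init__(self, capacity, seed=None):
        # Once `capacity` episodes are stored, the oldest one is evicted.
        self.episodes = deque(maxlen=capacity)
        self.current = []  # transitions of the episode still in progress
        self.rng = random.Random(seed)

    def push(self, obs, action, reward, next_obs, done):
        """Append one transition; close the episode when `done` is True."""
        self.current.append((obs, action, reward, next_obs, done))
        if done:
            self.episodes.append(self.current)
            self.current = []

    def sample(self, batch_size):
        """Return `batch_size` complete episodes, sampled with replacement."""
        return self.rng.choices(list(self.episodes), k=batch_size)


# Usage: store two toy episodes of different lengths, then sample a batch.
buffer = EpisodeReplayBuffer(capacity=100, seed=0)
for episode_length in (3, 5):
    for t in range(episode_length):
        done = t == episode_length - 1
        buffer.push(obs=t, action=0.0, reward=1.0, next_obs=t + 1, done=done)

batch = buffer.sample(batch_size=4)
```

Each item in batch is a full episode starting at its first step, so a recurrent network can be unrolled over the entire observation history instead of a truncated window.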

This repo offers:

  • DDPG, TD3 and SAC (clean PyTorch implementations, benchmarked against stable-baselines3*)
  • Recurrent versions of DDPG, TD3 and SAC that use full history: RDPG, RTD3 and RSAC
  • Very easy to understand and use; see our exhaustive documentation: link

*The benchmarking results can be found in the closed issue "Performance check against SB3".

For users:

  • Please feel free to ask questions about the code through Issues.
  • When cloning this repo, please consider a shallow clone (git clone --depth 1), since the repo is large due to its long commit history.

News:

  • [2021/11/18] Added the dependencies to the documentation; they had previously been left out.

Paper link:

https://arxiv.org/pdf/2110.12628.pdf

Poster (click to open in new tab for better resolution):

poster
