[DEPRECATED] Uses an ancient version of PyTorch
This is a PyTorch implementation of Advantage Actor Critic (A2C), a synchronous, deterministic variant of A3C. For background, check out the OpenAI baselines blog post.
This implementation is a bare-bones reinterpretation of the PyTorch A2C implementation by @ikostrikov. Our version uses only PyTorch and does not rely on the baselines package.
The original A2C implementation, written in TensorFlow, is part of OpenAI baselines.
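For readers new to A2C, the quantity at its core is the advantage: the discounted n-step return minus the critic's value estimate. The sketch below is a minimal pure-Python illustration and is not taken from this repository. The function names `discounted_returns` and `advantages`, their arguments, and the default `gamma` are assumptions made for the example.

```python
def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Compute discounted n-step returns for one environment's rollout.

    rewards, dones: equal-length lists, one entry per step.
    bootstrap_value: critic's estimate for the state after the last step.
    All names here are hypothetical and chosen for illustration.
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards so each return reuses the next one.
    for reward, done in zip(reversed(rewards), reversed(dones)):
        # A finished episode cuts off any reward flowing in from later steps.
        mask = 0.0 if done else 1.0
        running = reward + gamma * running * mask
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage estimate: n-step return minus the critic's value."""
    return [ret - val for ret, val in zip(returns, values)]
```

In a typical A2C update, the actor's log-probabilities are weighted by these advantages and the critic is regressed toward the returns. Because all parallel environments are stepped in lockstep, the whole batch is used in a single synchronous update.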
Many thanks to @openai and @ikostrikov!