
Deep Reinforcement Learning in Large Discrete Action Spaces

This is a PyTorch implementation of the paper "Deep Reinforcement Learning in Large Discrete Action Spaces" (Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin).
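
The paper's Wolpertinger policy works in two stages: a DDPG-style actor maps the state to a continuous "proto-action", the k nearest discrete actions (in an embedding space) are retrieved, and the critic scores those candidates so that the highest-valued one is executed. The sketch below only illustrates that selection step; it is not the code from main.ipynb, and the actor, critic and embedding names are assumptions. The paper uses an approximate nearest-neighbour index for very large action sets, while an exact search is enough for the small CartPole action set used here.

import torch

def wolpertinger_action(actor, critic, state, action_embeddings, k=10):
    """Pick a discrete action with the Wolpertinger procedure (illustrative sketch).

    actor             -- maps a 1-D state tensor to a continuous proto-action
    critic            -- maps (state, action-embedding) pairs to Q-values
    action_embeddings -- (num_actions, action_dim) tensor of the discrete actions
    """
    with torch.no_grad():
        proto = actor(state)                                        # (action_dim,)
        # k nearest discrete actions to the proto-action (exact L2 search)
        dists = torch.cdist(proto.unsqueeze(0), action_embeddings).squeeze(0)
        knn_idx = torch.topk(dists, k, largest=False).indices       # (k,)
        candidates = action_embeddings[knn_idx]                     # (k, action_dim)
        # refinement step: score every candidate with the critic, keep the best
        states = state.unsqueeze(0).expand(k, -1)                   # (k, state_dim)
        q_values = critic(states, candidates).squeeze(-1)           # (k,)
        best = torch.argmax(q_values)
    return knn_idx[best].item(), candidates[best]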

Installation

To install the required dependencies, run the following command:

pip install -r requirements.txt

Demonstration of the Model

Demonstration video:

cartpole_demo

Results for k = 1 and k = 10, where k is the number of nearest neighbours used by the policy:

[reward plot for k = 1: reward_k_1.png]
[reward plot for k = 10]

Train the agent

To train the agent, simply run the main.ipynb file provided in the repository. The parameters can be updated by changing the values in the Arguments class.
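
The exact hyperparameters are defined inside the notebook's Arguments class; the sketch below only illustrates the kind of fields such a class typically holds. The names and values here are assumptions, not the ones used in main.ipynb.

class Arguments:
    # Hypothetical fields and defaults -- check main.ipynb for the actual ones.
    env_name = "ContinuousCartPole"
    k = 10                  # nearest neighbours used by the Wolpertinger policy
    gamma = 0.99            # discount factor
    tau = 0.005             # soft target-network update rate
    actor_lr = 1e-4
    critic_lr = 1e-3
    batch_size = 64
    replay_size = 100000
    max_episodes = 1000
    max_steps = 500         # matches the 500-step rollout in the test snippet below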

Test the agent

After training the agent in main.ipynb, run the following snippet to test it on the CartPole environment.

import gym
from gym import wrappers

# ContinuousCartPoleEnv and the trained `agent` are assumed to already exist
# in the current session; both are defined in main.ipynb.
env_to_wrap = ContinuousCartPoleEnv()

# gym.wrappers.Monitor (older gym API) records the rollout as a video in ./demo;
# force=True overwrites any earlier recordings in that folder.
env = wrappers.Monitor(env_to_wrap, './demo', force=True)

for i_episode in range(1):
    observation = env.reset()
    ep_reward = 0
    for t in range(500):
        env.render()
        action = agent.select_action(observation)
        observation, reward, done, info = env.step(action)
        ep_reward += reward
        if done:
            print("Episode finished after {} timesteps".format(t + 1))
            break
    print(ep_reward)

# Closing the Monitor wrapper finalizes the recording before the raw env is closed.
env.close()
env_to_wrap.close()

Acknowledgements

Reference

If you find this work useful and want to cite it, please use the following reference:

@article{DBLP:journals/corr/Dulac-ArnoldESC15,
  author    = {Gabriel Dulac{-}Arnold and
               Richard Evans and
               Peter Sunehag and
               Ben Coppin},
  title     = {Reinforcement Learning in Large Discrete Action Spaces},
  journal   = {CoRR},
  volume    = {abs/1512.07679},
  year      = {2015},
  url       = {http://arxiv.org/abs/1512.07679},
  archivePrefix = {arXiv},
  eprint    = {1512.07679},
  timestamp = {Mon, 13 Aug 2018 16:46:25 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/Dulac-ArnoldESC15},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Collaborators

  1. Shashank Srikanth
  2. Nikhil Bansal