A Dueling Double Q-Network Implementation for solving RL environments in PyTorch
- The environment contains a single agent whose task is to collect yellow bananas to increase its cumulative reward
- The current state of the environment is represented by a 37-dimensional feature vector that contains the agent's velocity along with ray-based perception of objects around the agent's forward direction.
- The agent can interact with the environment using 4 discrete actions:
- 0 - move forward
- 1 - move backward
- 2 - turn left
- 3 - turn right
- Given this information, the agent has to learn how to best select actions
- A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana
- The task is episodic, and in order to solve the environment, the agent must get an average score of +13 over 100 consecutive episodes
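The solve criterion above can be checked with a rolling window over per-episode scores. The following is a minimal sketch, not code from this repository: the function name `episodes_to_solve` and its parameters are hypothetical, and it assumes each episode's score is the sum of the +1/-1 banana rewards collected in that episode.

```python
from collections import deque


def episodes_to_solve(scores, target=13.0, window=100):
    """Return the 1-based episode at which the mean of the last `window`
    scores first reaches `target`, or None if that never happens.

    `scores` is an iterable of per-episode total rewards (assumed to be
    the sum of +1 / -1 banana rewards within each episode).
    """
    recent = deque(maxlen=window)  # keeps only the most recent `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        # Only test the criterion once a full window of episodes is available.
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None
```

In a training loop, the same check could be run after every episode to decide when to stop training and save the network weights.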
- Python 3.6
- PyTorch (0.4,CUDA 9.0) : pip3 install torch torchvision
- ML-agents (0.4) : Refer to ml-agents for installation
- Numpy (1.14.5) : pip3 install numpy
- Matplotlib (3.0.2) : pip3 install matplotlib
- Jupyter notebook : pip3 install jupyter
- Download the environment from here and place it in the same folder as the Navigation.ipynb file
- Deep Q-Network
- Double Deep Q-Network
- Dueling Deep Q-Network
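To illustrate how the dueling and double variants listed above differ from a plain Deep Q-Network, here is a hedged PyTorch sketch. It is not the contents of this repository's model.py or dqn_agent.py: the class name `DuelingQNetwork`, the function `double_dqn_targets`, the single hidden layer of 64 units, and the discount `gamma=0.99` are all assumptions chosen for illustration. Only the state size (37) and action count (4) come from the environment description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DuelingQNetwork(nn.Module):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, state_size=37, action_size=4, hidden=64):
        super().__init__()
        self.feature = nn.Linear(state_size, hidden)     # shared features
        self.value = nn.Linear(hidden, 1)                # state value V(s)
        self.advantage = nn.Linear(hidden, action_size)  # advantages A(s, a)

    def forward(self, state):
        x = F.relu(self.feature(state))
        v = self.value(x)      # shape (batch, 1)
        a = self.advantage(x)  # shape (batch, action_size)
        # Subtracting the mean advantage keeps V and A identifiable.
        return v + a - a.mean(dim=1, keepdim=True)


def double_dqn_targets(online, target, rewards, next_states, dones, gamma=0.99):
    """Double DQN target: the online network picks the next action,
    the target network evaluates it.

    `rewards` and `dones` are float tensors of shape (batch, 1).
    """
    with torch.no_grad():
        best_actions = online(next_states).argmax(dim=1, keepdim=True)
        next_q = target(next_states).gather(1, best_actions)
    # Terminal transitions (dones == 1) contribute only the immediate reward.
    return rewards + gamma * next_q * (1.0 - dones)
```

Decoupling action selection from action evaluation is what reduces the overestimation bias of standard Q-learning targets, while the dueling head lets the network learn state values without having to estimate the effect of every action in every state.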
step 1 : Install all the dependencies
step 2 : git clone https://github.com/adithya-subramanian/Dueling-Double-Q-Network.git
step 3 : jupyter notebook
step 4 : Run all cells in the Navigation.ipynb file
Certain parts of dqn_agent.py, model.py and Navigation.ipynb have been adapted from Udacity's Deep Reinforcement Learning Nanodegree.