In this repo you will find my solution to the 2nd project of the Udacity Deep Reinforcement Learning Nanodegree, called "Continuous Control".
I will be training the first version of the environment, which consists of a SINGLE agent.
In this environment, a double-jointed arm can move to target locations. A reward of +0.1 is provided for each step that the agent's hand is in the goal location. Thus, the goal of your agent is to maintain its position at the target location for as many time steps as possible.
The observation space consists of 33 variables corresponding to position, rotation, velocity, and angular velocities of the arm. Each action is a vector with four numbers, corresponding to torque applicable to two joints. Every entry in the action vector should be a number between -1 and 1.
The task is episodic, and in order to solve the environment, your agent must get an average score of +30 over 100 consecutive episodes.
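As a quick illustration of these spaces, here is a minimal sketch of a random agent interacting with the environment. It assumes the unityagents package and the environment app set up in the steps below; the file name passed to UnityEnvironment is only an example and depends on your OS and download.

import numpy as np
from unityagents import UnityEnvironment

# example path; point this at the environment file you download in step 3
env = UnityEnvironment(file_name='Reacher_Linux/Reacher.x86_64')
brain_name = env.brain_names[0]

env_info = env.reset(train_mode=True)[brain_name]
state = env_info.vector_observations[0]            # 33-dimensional observation
score = 0
while True:
    action = np.clip(np.random.randn(1, 4), -1, 1)  # 4 torques, each in [-1, 1]
    env_info = env.step(action)[brain_name]
    state = env_info.vector_observations[0]
    score += env_info.rewards[0]
    if env_info.local_done[0]:
        break
print('Random-agent score:', score)
env.close()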
To install and use the environment, follow these steps:
1 Install the Anaconda suite and create (and activate) a new Python environment:
conda create --name drlnd python=3.6
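Then activate it (on older conda versions use source activate drlnd instead):
conda activate drlnd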
2 Install OpenAI Gym
pip install gym
3 Download the Unity environment app (not the Unity SDK!)
Copy the Unity environment file for your OS into your working directory and decompress it (see the example after the list below).
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
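For example, on Linux, assuming the downloaded archive is named Reacher_Linux.zip (adjust to the actual file name of your download):

unzip Reacher_Linux.zip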
4 Clone this repository and install other dependencies:
pip install ./python
5 Run Continuous_Control_ok.ipynb (a Jupyter notebook) with the working code and watch how it trains a new model and then uses it to play on its own.
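To open the notebook from the command line (assuming Jupyter was installed with the dependencies above):

jupyter notebook Continuous_Control_ok.ipynb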
To train a new agent, open the Continuous_Control_ok.ipynb Jupyter notebook and follow the steps described in each cell.
You will find a DDPG function that trains until np.mean(scores_deque) >= 30.0, i.e. until the average score over the last 100 episodes reaches +30.
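A minimal sketch of what that training loop looks like is shown below. The Agent class and its act/step/reset methods and actor_local/critic_local attributes are assumptions based on the standard Udacity DDPG template used in this notebook, not a verbatim copy of it.

from collections import deque
import numpy as np
import torch

def ddpg(env, agent, brain_name, n_episodes=1000, max_t=1000):
    scores_deque = deque(maxlen=100)   # scores of the last 100 episodes
    scores = []
    for i_episode in range(1, n_episodes + 1):
        env_info = env.reset(train_mode=True)[brain_name]
        state = env_info.vector_observations[0]
        agent.reset()                  # reset exploration noise (assumed Agent method)
        score = 0
        for t in range(max_t):
            action = agent.act(state)
            env_info = env.step(action)[brain_name]
            next_state = env_info.vector_observations[0]
            reward = env_info.rewards[0]
            done = env_info.local_done[0]
            agent.step(state, action, reward, next_state, done)
            state = next_state
            score += reward
            if done:
                break
        scores_deque.append(score)
        scores.append(score)
        if i_episode >= 100 and np.mean(scores_deque) >= 30.0:   # environment solved
            torch.save(agent.actor_local.state_dict(), 'checkpoint_actor2.pth')
            torch.save(agent.critic_local.state_dict(), 'checkpoint_critic2.pth')
            break
    return scores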
It will then load the saved agent from the files 'checkpoint_actor2.pth' and 'checkpoint_critic2.pth' and run a continuous episode that you can enjoy in realtime :-)
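A minimal sketch of how the saved checkpoints can be loaded to watch the trained agent. It assumes env, brain_name and agent already exist from the earlier notebook cells, and that the Agent's act method accepts an add_noise flag as in the standard DDPG template.

import torch

# load the trained actor/critic weights saved by the training cell
agent.actor_local.load_state_dict(torch.load('checkpoint_actor2.pth'))
agent.critic_local.load_state_dict(torch.load('checkpoint_critic2.pth'))

env_info = env.reset(train_mode=False)[brain_name]   # train_mode=False -> realtime rendering
state = env_info.vector_observations[0]
score = 0
while True:
    action = agent.act(state, add_noise=False)       # act greedily, no exploration noise
    env_info = env.step(action)[brain_name]
    state = env_info.vector_observations[0]
    score += env_info.rewards[0]
    if env_info.local_done[0]:
        break
print('Score:', score)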