Teach a Quadcopter How to Fly!
An agent built using a reinforcement learning algorithm to train and fly a quadcopter. The project implements the Deep Deterministic Policy Gradients (DDPG) algorithm.
Open and view the project using the .zip file provided or at my GitHub repository, where the project is also hosted. The starter project can be downloaded from the Udacity repository (cloned in the first step below). The project will be evaluated by a Udacity code reviewer according to the project rubric.
You will need the following setup to develop and run the project:
- Clone the repository and navigate to the downloaded folder.

      git clone https://github.com/udacity/RL-Quadcopter-2.git
      cd RL-Quadcopter-2
- Create and activate a new environment.

      conda create -n quadcop python=3.6 matplotlib numpy pandas
      source activate quadcop
- Create an IPython kernel for the `quadcop` environment.

      python -m ipykernel install --user --name quadcop --display-name "quadcop"
- Open the notebook.

      jupyter notebook Quadcopter_Project.ipynb
- Before running code, change the kernel to match the `quadcop` environment by using the drop-down menu (Kernel > Change kernel > quadcop). Then, follow the instructions in the notebook.
- You will likely need to install more pip packages to complete this project. Please curate the list of packages needed to run your project in the `requirements.txt` file in the repository (a hypothetical example follows this list).
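For illustration only, a hypothetical `requirements.txt` might look like the sketch below. The exact contents depend on what your agent implementation actually imports; the deep learning packages here are assumptions, not requirements of the starter code.

```text
# Hypothetical requirements.txt -- curate to match your own imports.
matplotlib
numpy
pandas
tensorflow   # assumed: only needed if your models are built on TensorFlow
keras        # assumed: only needed if your models are built with Keras
```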
To run the project:
- Activate the conda or Python virtual environment, then start the Jupyter notebook as described above. Open your browser and visit localhost:8888 (or the port indicated in the terminal); you should see all of the contents of the project in the `Quadcopter_Project.ipynb` notebook.
- After completing the development, press the play ▶️ icon to start the execution of cells. The output will be visible right below each respective cell.
The notebook contains the following functions and configurations:

- **Replay Buffer** to store and recall experience tuples (sketched in the first code example after this list).
- **DDPG: Actor (Policy) Model** for the copter, which is meant to map states to actions. Its custom loss function is defined using `action_gradients` and `action` (see the second code example after this list).
- **DDPG: Critic (Value) Model**, which is meant to map (state, action) pairs to their Q-values. The final output of this model is the Q-value for any given (state, action) pair; however, we also need to compute the gradient of this Q-value with respect to the corresponding action vector, which is needed for training the actor model.
- **DDPG: Agent**, built by putting together the actor and critic models. Note that we will need two copies of each model: one local and one target (the soft update between them is included in the second code example after this list).
- **Noise Model**, which uses the Ornstein–Uhlenbeck process (sketched in the first code example after this list).
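The replay buffer and the noise process are the simplest pieces. Here is a minimal sketch of both; the class names, attribute names, and default parameters are illustrative, not necessarily those used in the notebook.

```python
import random
from collections import deque, namedtuple

import numpy as np

Experience = namedtuple(
    'Experience', ['state', 'action', 'reward', 'next_state', 'done'])

class ReplayBuffer:
    """Fixed-size buffer that stores and recalls experience tuples."""

    def __init__(self, buffer_size=100000, batch_size=64):
        self.memory = deque(maxlen=buffer_size)  # oldest entries drop off
        self.batch_size = batch_size

    def add(self, state, action, reward, next_state, done):
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self):
        # Uniform random recall breaks the temporal correlation of episodes.
        return random.sample(self.memory, k=self.batch_size)

class OUNoise:
    """Ornstein-Uhlenbeck process: mean-reverting, temporally correlated noise."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2):
        self.mu = mu * np.ones(size)
        self.theta = theta
        self.sigma = sigma
        self.reset()

    def reset(self):
        self.state = self.mu.copy()

    def sample(self):
        # dx = theta * (mu - x) + sigma * N(0, 1): drift toward mu plus diffusion.
        dx = self.theta * (self.mu - self.state) \
             + self.sigma * np.random.randn(len(self.state))
        self.state = self.state + dx
        return self.state
```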
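The coupling between the actor and critic is the subtle part: the critic exposes the gradient of its Q-value output with respect to its action input, and the actor trains on that signal through a custom loss. Below is a condensed sketch of that wiring plus the local/target soft update. It assumes Keras on a TensorFlow 1.x backend (consistent with the python=3.6 environment above); the layer sizes and variable names are illustrative, not the notebook's exact architecture.

```python
from keras import layers, models, optimizers
import keras.backend as K

state_size, action_size = 12, 4  # illustrative; the task defines the real sizes

# --- Critic: maps (state, action) pairs to Q-values ---
states = layers.Input(shape=(state_size,))
actions = layers.Input(shape=(action_size,))
net = layers.Concatenate()([
    layers.Dense(32, activation='relu')(states),
    layers.Dense(32, activation='relu')(actions)])
Q_values = layers.Dense(1)(layers.Dense(64, activation='relu')(net))
critic = models.Model(inputs=[states, actions], outputs=Q_values)
critic.compile(optimizer=optimizers.Adam(), loss='mse')

# Gradient of the Q-value w.r.t. the action input: the `action_gradients`
# signal the actor trains on.
get_action_gradients = K.function(
    inputs=[*critic.input, K.learning_phase()],
    outputs=K.gradients(Q_values, actions))

# --- Actor: maps states to actions ---
actor_states = layers.Input(shape=(state_size,))
actions_out = layers.Dense(action_size, activation='sigmoid')(
    layers.Dense(32, activation='relu')(actor_states))
actor = models.Model(inputs=actor_states, outputs=actions_out)

# Custom loss: descend -Q (i.e. ascend Q) through the chosen actions.
action_gradients = layers.Input(shape=(action_size,))
loss = K.mean(-action_gradients * actions_out)
updates = optimizers.Adam(lr=1e-4).get_updates(
    params=actor.trainable_weights, loss=loss)
train_actor = K.function(
    inputs=[actor.input, action_gradients, K.learning_phase()],
    outputs=[], updates=updates)

# --- Soft update for the local/target copies of each model ---
def soft_update(local_model, target_model, tau=0.01):
    """Blend local into target: w_target <- tau*w_local + (1-tau)*w_target."""
    new_weights = [tau * lw + (1.0 - tau) * tw for lw, tw in
                   zip(local_model.get_weights(), target_model.get_weights())]
    target_model.set_weights(new_weights)
```

In the agent's learning step, you would call `get_action_gradients` on a batch sampled from the replay buffer, feed the result to `train_actor`, and then `soft_update` each target model so it slowly tracks its local counterpart.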
Follow the instructions in the notebook; they will lead you through the project. You'll be editing the `Quadcopter_Project.ipynb` file.
Once you're done with the app, stop it gracefully using the following steps:

- Select `File -> Close and Halt` inside the Jupyter notebook.
- Press `Ctrl+C` in the CLI.
- Deactivate the conda environment:

      conda deactivate

- Delete the environment if you are done with the project:

      conda remove --name quadcop --all