nd893 Deep Reinforcement Learning - Project 3 - Collaboration and Competition

Project Details

Work with the Tennis environment and solve it using deep reinforcement learning models for multi-agent continuous control.

In this environment, two agents control rackets to bounce a ball over a net. If an agent hits the ball over the net, it receives a reward of +0.1. If an agent lets a ball hit the ground or hits the ball out of bounds, it receives a reward of -0.01. Thus, the goal of each agent is to keep the ball in play.

The observation space consists of 8 variables corresponding to the position and velocity of the ball and racket. Each agent receives its own, local observation. Two continuous actions are available, corresponding to movement toward (or away from) the net, and jumping.
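The snippet below shows how an episode of this environment can be stepped through with random actions, assuming the unityagents package used in the Udacity workspace; the file path "Tennis.app" is an assumption and should be adjusted to match your own download and operating system.

```python
# Minimal interaction loop with the Tennis environment (sketch, not the notebook's exact code).
import numpy as np
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Tennis.app")   # path is an assumption; adjust per OS/download
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

env_info = env.reset(train_mode=False)[brain_name]
num_agents = len(env_info.agents)                # two rackets
action_size = brain.vector_action_space_size
states = env_info.vector_observations            # one row of observations per agent

scores = np.zeros(num_agents)
while True:
    actions = np.clip(np.random.randn(num_agents, action_size), -1, 1)  # random policy
    env_info = env.step(actions)[brain_name]
    states = env_info.vector_observations
    scores += env_info.rewards
    if np.any(env_info.local_done):              # episode ends when either agent is done
        break

print("Episode score (max over agents):", np.max(scores))
env.close()
```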

Trained Agent

Coding

I wrote this code in the online workspace at Udacity.

I am still figuring out how to build this environment on my own machine.

Installation

(These instructions are based on online resources and have not been fully tested yet.)

The environment can be downloaded from one of the links below for all operating systems:

Run Training

Run Tennis.ipynb for step-by-step details.
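At a high level, the notebook drives a training loop like the one sketched below. The agent interface (act(), step(), reset()) is a hypothetical placeholder and may not match MADDPG_agent.py; the +0.5 average score over 100 episodes (taking the max over the two agents each episode) is the project's standard solve criterion.

```python
# Sketch of a training loop of the kind Tennis.ipynb would run (agent interface is assumed).
from collections import deque
import numpy as np

def train(env, brain_name, agent, n_episodes=3000, solve_score=0.5):
    scores_window = deque(maxlen=100)
    all_scores = []
    for i_episode in range(1, n_episodes + 1):
        env_info = env.reset(train_mode=True)[brain_name]
        states = env_info.vector_observations
        agent.reset()                                  # e.g. reset exploration noise (assumed method)
        scores = np.zeros(len(env_info.agents))
        while True:
            actions = agent.act(states)                # one action per agent (assumed method)
            env_info = env.step(actions)[brain_name]
            next_states = env_info.vector_observations
            rewards = env_info.rewards
            dones = env_info.local_done
            agent.step(states, actions, rewards, next_states, dones)  # store experience and learn
            states = next_states
            scores += rewards
            if np.any(dones):
                break
        episode_score = np.max(scores)                 # per-episode score: max over the two agents
        scores_window.append(episode_score)
        all_scores.append(episode_score)
        if i_episode >= 100 and np.mean(scores_window) >= solve_score:
            print(f"Solved in {i_episode} episodes, average score {np.mean(scores_window):.2f}")
            break
    return all_scores
```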

model.py contains the neural network classes for the Actor and Critic function approximators.
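A rough sketch of what such actor and critic networks look like is shown below; the layer sizes are illustrative assumptions and the actual architectures in model.py may differ.

```python
# Illustrative actor/critic networks (sizes are assumptions, not the repository's actual values).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Actor(nn.Module):
    """Deterministic policy: maps one agent's observation to an action in [-1, 1]."""
    def __init__(self, state_size, action_size, fc1=256, fc2=128):
        super().__init__()
        self.fc1 = nn.Linear(state_size, fc1)
        self.fc2 = nn.Linear(fc1, fc2)
        self.fc3 = nn.Linear(fc2, action_size)

    def forward(self, state):
        x = F.relu(self.fc1(state))
        x = F.relu(self.fc2(x))
        return torch.tanh(self.fc3(x))      # continuous actions bounded to [-1, 1]

class Critic(nn.Module):
    """Centralized action-value function: scores the joint observations and actions of all agents."""
    def __init__(self, full_state_size, full_action_size, fc1=256, fc2=128):
        super().__init__()
        self.fc1 = nn.Linear(full_state_size + full_action_size, fc1)
        self.fc2 = nn.Linear(fc1, fc2)
        self.fc3 = nn.Linear(fc2, 1)

    def forward(self, states, actions):
        x = torch.cat([states, actions], dim=-1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)
```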

MADDPG_agent.py is an implementation of Multi-Agent Deep Deterministic Policy Gradients (MADDPG), as described in the paper.

In this model, each agent is itself modeled as a Deep Deterministic Policy Gradient (DDPG) agent (paper): it keeps its own actor, while its critic is trained on the observations and actions of all agents.
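The sketch below illustrates one MADDPG learning step for a single agent under that scheme. It is a minimal, self-contained example with placeholder network sizes and a random batch standing in for a replay-buffer sample; it is not the repository's actual implementation or hyperparameters.

```python
# One MADDPG learning step for agent 0 (sketch; sizes and batch are placeholders).
import torch
import torch.nn as nn
import torch.nn.functional as F

num_agents, state_size, action_size, batch = 2, 8, 2, 128
gamma = 0.99

def actor():
    return nn.Sequential(nn.Linear(state_size, 64), nn.ReLU(),
                         nn.Linear(64, action_size), nn.Tanh())

def critic():
    return nn.Sequential(nn.Linear(num_agents * (state_size + action_size), 64),
                         nn.ReLU(), nn.Linear(64, 1))

actors         = [actor() for _ in range(num_agents)]
actors_target  = [actor() for _ in range(num_agents)]
critics        = [critic() for _ in range(num_agents)]
critics_target = [critic() for _ in range(num_agents)]

# Placeholder replay-buffer sample: per-agent observations, actions, rewards, dones.
obs      = torch.randn(batch, num_agents, state_size)
actions  = torch.randn(batch, num_agents, action_size)
rewards  = torch.randn(batch, num_agents)
next_obs = torch.randn(batch, num_agents, state_size)
dones    = torch.zeros(batch, num_agents)

i = 0  # update agent 0; the same step is repeated for every agent

# Critic update: the TD target uses every agent's target actor (centralized critic).
with torch.no_grad():
    next_actions = torch.stack(
        [actors_target[j](next_obs[:, j]) for j in range(num_agents)], dim=1)
    q_next = critics_target[i](torch.cat(
        [next_obs.flatten(1), next_actions.flatten(1)], dim=1))
    q_target = rewards[:, i:i+1] + gamma * q_next * (1 - dones[:, i:i+1])
q_expected = critics[i](torch.cat([obs.flatten(1), actions.flatten(1)], dim=1))
critic_loss = F.mse_loss(q_expected, q_target)

# Actor update: only agent i's action is re-derived from its current policy.
pred_actions = torch.stack(
    [actors[i](obs[:, j]) if j == i else actions[:, j] for j in range(num_agents)], dim=1)
actor_loss = -critics[i](torch.cat(
    [obs.flatten(1), pred_actions.flatten(1)], dim=1)).mean()

print(critic_loss.item(), actor_loss.item())
# In a full implementation, both losses are backpropagated through their optimizers,
# followed by a soft update of the target networks.
```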

Report

Please refer to REPORT.md

Future Work

TBD; I need more time to extend this work.