
Human-level control through deep reinforcement learning

Project for the course Capita Selecta Computer Science: Artificial Intelligence at KU Leuven

Components of the project

Main goal

A presentation on the topic "Human-level control through deep reinforcement learning". For our demo, we implemented an agent that is able to learn to play a variety of (simple) games using deep reinforcement learning.

Presentation (±30 min):

  1. Intro to the problem this project tries to solve
  2. A refresher on reinforcement learning
  3. An introduction to convolutional neural networks
  4. Deep reinforcement learning
  5. Demo!
  6. Curious Reinforcement Learning, a short section on what may come after deep reinforcement learning.

Implementation

The implementation of an agent that can successfully learn to play games can be found in src/main.py.

Neural network

Our implementation closely follows DeepMind's Deep Q-Network (DQN).
The deep neural network takes 4 stacked video frames as input (grayscale, 84x84 resolution) and outputs an estimated Q-value for each possible action. It consists of 3 convolutional layers followed by 2 fully connected layers, with ReLU activations in between. The network uses experience replay and a target network, as described in DeepMind's DQN paper.
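As a concrete illustration, below is a minimal sketch of such a network in PyTorch. It assumes the standard layer sizes from DeepMind's Nature DQN paper; the exact hyperparameters used in src/main.py may differ.

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """Convolutional Q-network: 4 stacked 84x84 grayscale frames in,
    one Q-value per action out (layer sizes follow the Nature DQN paper)."""

    def __init__(self, n_actions):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4),   # 4x84x84 -> 32x20x20
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),  # 32x20x20 -> 64x9x9
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),  # 64x9x9 -> 64x7x7
            nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(7 * 7 * 64, 512),
            nn.ReLU(),
            nn.Linear(512, n_actions),                   # one Q-value per action
        )

    def forward(self, x):
        # x: batch of stacked frames, shape (batch, 4, 84, 84), values in [0, 1]
        x = self.features(x)
        return self.head(x.flatten(start_dim=1))
```

The final layer has one output per action; when playing, the agent simply picks the action with the highest Q-value.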

Virtual environment

For the agent to learn to play games, we needed a virtual environment.

Requirements
  • Can host a variety of games
  • Provides observations from the environment
  • Accepts actions performed in the environment
  • Returns a reward for each action
  • Offers tracks to drive on (a requirement from the self-driving-car approach we started with, see below)

The environment we ended up using is OpenAI's Gym.
We originally started off with Udacity's Behavioral Cloning project to create an agent for self-driving cars. That path turned out to be unfruitful, as we were not able to correctly synchronise the simulator's data with the separate neural network program.
We then opted for a simpler approach in order to stay within the time allocated for this project. OpenAI's Gym library allowed us to write a working program in much less time, and it also gave us the possibility to try out our network on a variety of different games.
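To make the observation/action/reward cycle concrete, here is a minimal random-agent loop against a Gym environment. It assumes the classic Gym API in which env.step returns (observation, reward, done, info); the environment id is only an example.

```python
import gym

env = gym.make("BreakoutNoFrameskip-v4")

observation = env.reset()   # get an initial observation from the environment
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()                    # random action; the agent would choose one here
    observation, reward, done, info = env.step(action)    # act and receive a reward
    total_reward += reward

print("Episode finished with total reward", total_reward)
env.close()
```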

Dependencies

This project relies on the following python dependencies:

  • numpy
  • pytorch
  • gym
  • gym[atari]

Usage

The agent can be trained as follows:

python3 main.py

The agent will start learning and will output the achieved score and save its network's weights every 10 episodes.

The trained agent can afterwards be loaded and used to play the game with:

python3 play.py
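For reference, this save/load workflow typically looks like the sketch below in PyTorch. The file name, the DQN class (from the sketch in the Neural network section) and the environment id are illustrative assumptions, not necessarily the exact names used in src/main.py and play.py.

```python
import torch
import gym

# Illustrative checkpoint round trip; names and file paths are assumptions.
env = gym.make("BreakoutNoFrameskip-v4")
policy_net = DQN(n_actions=env.action_space.n)   # DQN class from the sketch above

# Training side (main.py): persist the weights, in our case every 10 episodes.
torch.save(policy_net.state_dict(), "dqn_weights.pth")

# Playing side (play.py): rebuild the same architecture and restore the weights.
loaded_net = DQN(n_actions=env.action_space.n)
loaded_net.load_state_dict(torch.load("dqn_weights.pth"))
loaded_net.eval()   # evaluation mode: the network is now only used for action selection
```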

Built with

  • OpenAI Gym - Environments to interact with games
  • OpenAI Universe - Training and evaluating AI agents
  • PyTorch - Deep neural networks engine with GPU acceleration
  • Behavioral Cloning Project - Udacity's car simulation environment, which allows neural networks to learn to drive cars autonomously around tracks.

Authors

License

Distributed under the MIT license. See LICENSE for more information.

References

Papers

Useful links