
Human-level control through deep reinforcement learning

Project for the course Capita Selecta Computer Science: Artificial Intelligence at KU Leuven

Components of the project

Main goal

A presentation on the topic "Human-level control through deep reinforcement learning". For our demo, we implemented an agent that is able to learn to play a variety of (simple) games using deep reinforcement learning.

Presentation (±30 min):

  1. Intro to the problem this project tries to solve
  2. A refresher on reinforcement learning
  3. An introduction to convolutional neural networks
  4. Deep reinforcement learning
  5. Demo!
  6. Curious Reinforcement Learning, a short section on what may come after deep reinforcement learning.

Implementation

The implementation of an agent that can successfully learn to play games can be found in src/main.py.

Neural network

Our implementation closely follows DeepMind's Deep Q-Network (DQN).
The deep neural network takes 4 stacked video frames as input (grayscale, 84x84 resolution) and outputs an estimated Q-value for each possible action. It consists of 3 convolutional layers followed by 2 fully connected layers, with ReLU activations in between. The network uses experience replay and a target network, as described in DeepMind's DQN paper.
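As a concrete illustration, below is a minimal sketch of such a network in PyTorch. It assumes the standard layer sizes from DeepMind's Nature DQN paper; the exact hyperparameters used in src/main.py may differ.

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """Convolutional Q-network: 4 stacked 84x84 grayscale frames in,
    one Q-value per action out (layer sizes follow the Nature DQN paper)."""

    def __init__(self, n_actions):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4),   # 4x84x84 -> 32x20x20
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),  # 32x20x20 -> 64x9x9
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),  # 64x9x9 -> 64x7x7
            nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(7 * 7 * 64, 512),
            nn.ReLU(),
            nn.Linear(512, n_actions),                   # one Q-value per action
        )

    def forward(self, x):
        # x: batch of stacked frames, shape (batch, 4, 84, 84), values in [0, 1]
        x = self.features(x)
        return self.head(x.flatten(start_dim=1))
```

The final layer has one output per action; when playing, the agent simply picks the action with the highest Q-value.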

Virtual environment

For the agent to learn to play games, we needed a virtual environment.

Requirements
  • Can host a variety of games
  • Provides observations from the environment
  • Accepts actions performed in the environment
  • Returns a reward for each action
  • Offers tracks to drive on (a requirement from the self-driving-car approach we started with, see below)

The environment we ended up using is OpenAI's Gym.
We originally started off with Udacity's Behavioral Cloning project to create an agent for self-driving cars. That path turned out to be unfruitful, as we were not able to correctly synchronise the simulator's data with the separate neural network program.
We then opted for a simpler approach in order to stay within the time allocated for this project. OpenAI's Gym library allowed us to write a working program in much less time, and it also gave us the possibility to try out our network on a variety of different games.
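To make the observation/action/reward cycle concrete, here is a minimal random-agent loop against a Gym environment. It assumes the classic Gym API in which env.step returns (observation, reward, done, info); the environment id is only an example.

```python
import gym

env = gym.make("BreakoutNoFrameskip-v4")

observation = env.reset()   # get an initial observation from the environment
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()                    # random action; the agent would choose one here
    observation, reward, done, info = env.step(action)    # act and receive a reward
    total_reward += reward

print("Episode finished with total reward", total_reward)
env.close()
```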

Dependencies

This project relies on the following python dependencies:

  • numpy
  • pytorch
  • gym
  • gym[atari]

Usage

The agent can be trained as follows:

python3 main.py

The agent will start learning and will output the achieved score and save its network's weights every 10 episodes.

The trained agent can afterwards be loaded and used to play the game with:

python3 play.py
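For reference, this save/load workflow typically looks like the sketch below in PyTorch. The file name, the DQN class (from the sketch in the Neural network section) and the environment id are illustrative assumptions, not necessarily the exact names used in src/main.py and play.py.

```python
import torch
import gym

# Illustrative checkpoint round trip; names and file paths are assumptions.
env = gym.make("BreakoutNoFrameskip-v4")
policy_net = DQN(n_actions=env.action_space.n)   # DQN class from the sketch above

# Training side (main.py): persist the weights, in our case every 10 episodes.
torch.save(policy_net.state_dict(), "dqn_weights.pth")

# Playing side (play.py): rebuild the same architecture and restore the weights.
loaded_net = DQN(n_actions=env.action_space.n)
loaded_net.load_state_dict(torch.load("dqn_weights.pth"))
loaded_net.eval()   # evaluation mode: the network is now only used for action selection
```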

Built with

  • OpenAI Gym - Environments to interact with games
  • OpenAI Universe - Training and evaluating AI agents
  • PyTorch - Deep neural networks engine with GPU acceleration
  • Behavioral Cloning Project - Udacity's car simulation environment, which allows neural networks to learn to drive cars autonomously around tracks.

Authors

License

Distributed under the MIT license. See LICENSE for more information.

References

Papers

Useful links