
An AI trained using the DQN algorithm to play Pacman

Anna is an AI trained using the DQN (Deep Q-Network) algorithm to play Pacman.


Python 3.5





GPU Training: cuda


To start training, run dqn.py with -m Train. To test the latest agent, use -m Test. The -iw flag enables user interaction; these interactions are also recorded as experiences while training. The -v flag enables verbose reporting.


python3 dqn.py -m Train -v
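
For reference, the flags described above could be parsed along these lines. This is only a sketch: the `build_parser` helper and the `mode`, `interactive` and `verbose` attribute names are assumptions made for illustration, not taken from dqn.py.

```python
import argparse


def build_parser():
    """Build a parser for the documented flags (illustrative, not the real dqn.py)."""
    parser = argparse.ArgumentParser(prog="dqn.py")
    # -m selects the mode; only Train and Test are documented.
    parser.add_argument("-m", dest="mode", choices=["Train", "Test"], required=True)
    # -iw enables user interaction (assumed here to be a plain on/off switch).
    parser.add_argument("-iw", dest="interactive", action="store_true")
    # -v enables verbose reporting.
    parser.add_argument("-v", dest="verbose", action="store_true")
    return parser


# Equivalent to: python3 dqn.py -m Train -v
args = build_parser().parse_args(["-m", "Train", "-v"])
print(args.mode, args.interactive, args.verbose)
```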

Code Structure


This file checks that the environment works. It first verifies that Keras is installed and retrieves the backend in use. It then tests that backend with a basic function and reports whether the CPU or GPU is being used.

python3 env_test.py
## Checking Keras

Using TensorFlow backend.

## Checking TensorFlow

Version 1.5.0

TensorFlow is using the CPU
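
A check of this kind might be sketched as follows. The function names `describe_device` and `check_environment` are hypothetical and this is not the contents of env_test.py. It also only lists the devices TensorFlow can see, which is an assumption standing in for the basic backend test described above.

```python
def describe_device(device_types):
    """Summarise a list of device types such as ["CPU", "GPU"]."""
    using_gpu = any(kind == "GPU" for kind in device_types)
    return "TensorFlow is using the " + ("GPU" if using_gpu else "CPU")


def check_environment():
    """Print the Keras backend and whether TensorFlow can see a GPU."""
    try:
        from keras import backend as K
    except ImportError:
        print("Keras is not installed")
        return
    print("Keras backend:", K.backend())
    if K.backend() != "tensorflow":
        return
    import tensorflow as tf
    from tensorflow.python.client import device_lib
    print("Version", tf.__version__)
    # Seeing a GPU device is treated here as "using the GPU" (an assumption).
    kinds = [d.device_type for d in device_lib.list_local_devices()]
    print(describe_device(kinds))
```

Calling `check_environment()` on a machine with Keras and TensorFlow installed should print a report similar in spirit to the one shown above.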


This is the DQN implementation.

Important Parameters

FRAMEWORK = 'tf'            # tf/theano
MODEL_NAME = 'Anna'         # An AI has to have a name! Also the subdirectory name
MODEL_VERSION = 1           # The version is used for the file name
VERSION_UPDATE = 1200       # New version interval, in seconds
SAVE_INTERVAL = 1000        # Save interval in iterations
# It is recommended to set this higher on non-SSD hard disks as dumping is expensive
DUMPING_INTERVAL = 5000     # Experience memory dumping interval in iterations
REPORT_INTERVAL = 1000      # Frames before reporting, if not verbose
THROTTLING_PERIOD = 2       # Frames to skip before sampling the experience log and training again
ACTIONS = 6                 # Number of valid actions
INITIAL_GAMMA = 0.6         # Low confidence in predictions while exploring
# A large gamma for a game where positive rewards are usually closely-packed can eventually cause an overflow
FINAL_GAMMA = 0.8           # High confidence when perfecting the technique
OBSERVE_PERIOD = 5000       # Frames to observe before training
EXPLORE_PERIOD = 1000000    # Iterations over which to anneal EPSILON from initial to final
INITIAL_EPSILON = 0.1       # Starting value of EPSILON
FINAL_EPSILON = 0.001       # Final value of EPSILON
MEMORY_SIZE = 50000         # Number of previous transitions to remember
BATCH = 32                  # Size of experiences to train on
FRAMES_PER_ACTION = 1       # The delay before taking another action (set to 1 for no delay)
LEARNING_RATE = 1e-4        # Our network's learning rate
FRAMES_PER_SAMPLE = 4       # How many frames to stack per sample, good for detecting time-based amounts like velocity
RESTORE_STATE = True        # Whether or not to restore a state if found
RESTORE_MEMORY = True       # Whether or not to restore the experience memory if found
# The image size, images are rotated 90 degrees in a matrix so the height is rows and the width is columns
# It's better to use a square image because, otherwise, lines might get jagged or completely disappear while resizing
IMG_ROWS, IMG_COLS = 132, 132
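
To illustrate how the initial and final values above relate, here is a minimal annealing sketch. It assumes a linear schedule, and it assumes gamma is moved over the same EXPLORE_PERIOD as epsilon. Neither is confirmed by this document, and the `anneal` helper is hypothetical.

```python
INITIAL_EPSILON = 0.1
FINAL_EPSILON = 0.001
INITIAL_GAMMA = 0.6
FINAL_GAMMA = 0.8
EXPLORE_PERIOD = 1000000


def anneal(initial, final, step, period=EXPLORE_PERIOD):
    """Move linearly from `initial` to `final` over `period` steps, then hold."""
    fraction = min(max(step / period, 0.0), 1.0)
    return initial + (final - initial) * fraction


for step in (0, 500000, 1000000, 2000000):
    epsilon = anneal(INITIAL_EPSILON, FINAL_EPSILON, step)
    gamma = anneal(INITIAL_GAMMA, FINAL_GAMMA, step)
    print(step, epsilon, gamma)
```

With a schedule like this, epsilon shrinks (less random exploration) while gamma grows (more weight on predicted future rewards) as training progresses.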

Important functions:


This function builds our Keras model.


This function saves our model (the design and weights) along with other relevant progress information, such as the current iteration, the model version and our epsilon/gamma parameters.


This function dumps the experience memory (states, actions, rewards) to a file.


Loads the previously learned weights for our model.


Checks for a saved state and loads it if found.


Checks for a dumped memory and loads it if found.
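
The dump and restore steps described above might look roughly like this. It is a sketch that assumes the memory is a bounded deque serialised with pickle. The helper names, the file name and the placeholder transitions are illustrative, not the project's actual code.

```python
import os
import pickle
import random
import tempfile
from collections import deque

MEMORY_SIZE = 50000
BATCH = 32


def dump_memory(memory, path):
    """Write the experience memory to disk as a plain list."""
    with open(path, "wb") as handle:
        pickle.dump(list(memory), handle, protocol=pickle.HIGHEST_PROTOCOL)


def restore_memory(path, maxlen=MEMORY_SIZE):
    """Load a dumped memory if the file exists, otherwise start empty."""
    if not os.path.exists(path):
        return deque(maxlen=maxlen)
    with open(path, "rb") as handle:
        return deque(pickle.load(handle), maxlen=maxlen)


memory = deque(maxlen=MEMORY_SIZE)  # oldest transitions are dropped when full
for i in range(100):
    memory.append((i, i % 6, 0.0))  # placeholder (state, action, reward)

path = os.path.join(tempfile.mkdtemp(), "memory.pkl")
dump_memory(memory, path)
restored = restore_memory(path)
batch = random.sample(list(restored), BATCH)  # one training batch
print(len(restored), len(batch))
```

Because real states are stacked image frames, a full memory can be large, which is consistent with the advice above to dump less often on slower disks.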


Starts exercising the network using the game in train/test mode.


This project was inspired by the Keras-FlappyBird project. I initially borrowed the same model but later modified it to better suit Pacman. You can read more about it here: https://yanpanlau.github.io/2016/07/10/FlappyBird-Keras.html

I do not own the game implementation used; you can find it in the following repo: https://github.com/greyblue9/pacman-python