Sphero DQN Agent is a reinforcement learning (RL) agent that uses DQN to navigate a Sphero through an environment while optimizing for speed and minimizing impacts from collisions.
Sphero DQN Agent is implemented as a python command line script, sphero_dqn_agent.py
.
Since sphero_dqn_agent.py
is a single script file,
installation is as easy as cloning the git repo or downloading the file.
However, sphero_dqn_agent.py
does depend on a few libraries.
A requirements.txt
file is included for your convenience.
To install the required dependencies run
pip install -r requirements.txt
Note for Windows Users:
You may need to run this command instead of the one above.
py -m pip install -r requirements.txt
You may also need to install the Visual Studio Build Tools for C++ before running the above command.
Note for Windows Users:
You may need to replace python
with py
in the commands below.
Show help message:
python sphero_dqn_agent.py -h
Save model and configuration files to a specified directory:
python sphero_dqn_agent.py -p <dir>
Train the model for 100 episodes saving the model every 10 episodes. Script will automatically pick the latest model file and start training with that model.
python sphero_dqn_agent.py -t 100 -s 10 -p <dir>
Run 10 episodes using the model and configuration files at <dir>
.
python sphero_dqn_agent.py -r 10 -p <dir>
Note: Connections to the Sphero can take longer than you might normally expect.
Much of the agent, environment, and Sphero can be configured via JSON files. The easiest way to start configuring is to run
python sphero_dqn_agent.py -p <dir>
and look at the JSON files generated in <dir>
These are the set of hyperparams related to the DQN algorithm.
They are configured in hyperparams.json
.
discount_rate
- a.k.a gamma
epsilon
epsilon_min
epsilon_decay_rate
learning_rate
num_steps_per_episode
target_transfer_period
memory_buffer_size
These are the set of environment parameters and are configured in env_config.json
.
center_sphero_every_reset
max_steps_per_episode
stop_episode_at_collision
num_collisions_to_record
collision_penalty_multiplier
min_velocity_magnitude
low_velocity_penalty
velocity_reward_multiplier
These are the set of parameters that are used to configure the behavior of the Sphero
and are configured in sphero_config.json
.
min_collision_threshold
collision_dead_time
level_sphero
These are the set of params used to configure the bluetooth connection to the Sphero
and are configured in bluetooth_config.json
.
use_ble
sphero_search_name
You will need to modify the script file to change the structure of the nueral network/model.
Look for the function build_neural_network
.
The script will load a saved model from the model_<episode count>.h5
file with the largest <episode count>
.
<episode count>
is the number of episodes the model has been trained on thus far.
This makes it easy to share your custom model configurations with someone else
and have them keep training where you left off.
Results are saved to .csv files in the directory specified in the -p
or --path
option.
There are seperate files for results during training and results during non-training runs.
The following were referenced in the creation of this project.
Playing Atari with Deep Reinforcement Learning
keon.io: Deep Q-Learning with Keras and Gym