A framework for building Reinforcement Learning (RL) AI agents for different variations of Nintendo's Super Smash Bros games. Because the games run on different platforms, several platforms are supported. The framework also makes heavy use of Slippi, which enables 'offline' training of agents on replays from past tournaments.
A couple of pre-tuned RL algorithms are provided in the `algorithms` package, but it should be fairly easy to bring your own snowflake implementation of an RL agent to play. For pointers on how to do that, see the `smashrl` package.
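For illustration only, a bring-your-own agent could look roughly like the sketch below. The class and method names here are assumptions, not the actual interface expected by the framework; see the `smashrl` package for the real contract.

```python
# Hypothetical sketch of a bring-your-own RL agent. The names (RandomAgent,
# act, observe) are assumptions for illustration only; check the `smashrl`
# package for the interface the framework actually expects.
import random
from typing import Sequence


class RandomAgent:
    """Baseline agent that picks a random action every frame."""

    def __init__(self, actions: Sequence[int]) -> None:
        self.actions = list(actions)

    def act(self, state) -> int:
        # A learned agent would map the game state to an action here.
        return random.choice(self.actions)

    def observe(self, state, action, reward, next_state, done) -> None:
        # A learning agent would store the transition and update its policy.
        pass
```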
Current platform + game combinations supported:
- Dolphin Emulator + Super Smash Bros Brawl
Future platforms:
- Nintendo Switch
- Faster Melee
Future games:
- Super Smash Bros Melee
- Super Smash Bros Ultimate (Switch)
Additional future features:
- Twitch play
- Multi-agent support
- Swappable algorithms (+ more algorithm implementations)
Requirements:
- Python 3.6+
To install all required dependencies into your environment you can run:
$ make dep-install
Steps to generate training data: TBD
Run an emulator with a bot: TBD
The rest of this section is TBD.
Dependencies are handled using `pip-tools`, which means they are listed without version numbers in the `requirements.in` file. This file is then 'compiled', which collects and pins all required packages in a `requirements.txt` file.
First we need to get `pip-tools`. To install it, run:
$ python3.6 -m pip install -r requirements-dev.txt
In addition to `pip-tools`, this command installs a couple of handy tools and linters for Python. Please use them.
Then we can run `pip-tools` to generate `requirements.txt`:
$ pip-compile requirements.in
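For example (the package name and version below are placeholders, not actual project dependencies), an unpinned entry in `requirements.in` such as `numpy` would be compiled into a pinned entry like `numpy==1.19.5` in `requirements.txt`, along with pinned versions of its transitive dependencies and a `# via` comment noting where each requirement came from.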
To upgrade all dependencies run:
$ make dep-update
It plays! 🎆 🍺
TODO:
- Figure out why running off the side of the stage produces incorrect negative rewards
- Investigate menu states, read from Dolphin memory directly
- Recover from menu issues, game failures, bot wonkiness
- Train new agent with loooots of data
- Work out a good reward function
- Do proper batching
- Test our DQN against a known-good OpenAI Gym environment (FrozenLake, MountainCar, etc.); see the sketch after this list
- Minimize the action space - lots of redundant actions do the same thing
- Configure headless environment - run in Docker
- Plot reward over time and other stats (avg, sum, etc)
- Train against the highest-level default bot online
- Add device config object to pass into each device class
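Regarding the DQN sanity-check item above, a minimal sketch of that kind of loop is shown below, using Gymnasium's FrozenLake. Plain tabular Q-learning stands in for the project's DQN, whose actual interface is not shown here; the point is only the known-good environment loop.

```python
# Minimal sketch for sanity-checking an agent on a known-good environment,
# as suggested in the TODO list. Uses Gymnasium's FrozenLake; tabular
# Q-learning stands in for the project's DQN (its real interface may differ).
import numpy as np
import gymnasium as gym

env = gym.make("FrozenLake-v1", is_slippery=False)
q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1
returns = []

for episode in range(2000):
    state, _ = env.reset()
    done, total = False, 0.0
    while not done:
        # Epsilon-greedy action selection.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # One-step Q-learning update.
        q[state, action] += alpha * (
            reward + gamma * np.max(q[next_state]) - q[state, action]
        )
        state, total = next_state, total + reward
    returns.append(total)

# With a working agent the average return should approach 1.0 on this map.
print("Average return over last 100 episodes:", np.mean(returns[-100:]))
```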
$ make tests
TBD
TBD