Welcome to the 2AMC15 Data Intelligence Challenge! This is the repository containing the challenge environment code. To get started:
- Create an agent which inherits from the `BaseAgent` class.
- Add the agents you want to test to `train.py`. There are two places to add your agent; look for the comment `# Add your agent here` to see where.
- Run `$ python train.py grid_configs/rooms-1.grd --out results/` to start training!
`train.py` is just an example training script; feel free to modify it as necessary. In our basic example, we use command-line arguments to select options. If this is not convenient for you, you can replace this training script with whatever you want. By default, its usage is:
```
usage: train.py [-h] [--no_gui] [--sigma SIGMA] [--fps FPS] [--iter ITER]
                [--random_seed RANDOM_SEED] [--out OUT]
                GRID [GRID ...]

DIC Reinforcement Learning Trainer.

positional arguments:
  GRID                  Paths to the grid file to use. There can be more than
                        one.

options:
  -h, --help            show this help message and exit
  --no_gui              Disables rendering to train faster
  --sigma SIGMA         Sigma value for the stochasticity of the environment.
  --fps FPS             Frames per second to render at. Only used if no_gui is
                        not set.
  --iter ITER           Number of iterations to go through.
  --random_seed RANDOM_SEED
                        Random seed value for the environment.
  --out OUT             Where to save training results.
```
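For example, the following command trains on the bundled `rooms-1.grd` grid without the GUI and writes results to `results/` (the iteration count is just an illustrative value):

```
$ python train.py grid_configs/rooms-1.grd --no_gui --iter 1000 --out results/
```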
The code is made up of 3 modules:
- `agent`
- `level_editor`
- `world`
The `agent` module contains the `BaseAgent` class as well as some benchmark agents to test against. `BaseAgent` is an abstract class, and all RL agents for DIC must inherit from (i.e., implement) it. If you already know and understand class inheritance, skip the following section.
Think of this like how all models in PyTorch start:

```python
import torch.nn as nn

class NewModel(nn.Module):
    def __init__(self):
        super().__init__()
        ...  # define your layers here

    def forward(self, x):
        ...  # define the forward pass here
```
In this case, `NewModel` inherits from `nn.Module`, which gives it the ability to do backpropagation, store parameters, etc., without you having to code that manually every time. Inheriting also ensures that every class derived from `nn.Module` provides at least the `forward()` method, which allows a forward pass to actually happen.
In the case of your RL agent, inheriting from `BaseAgent` guarantees that your agent implements `process_reward()` and `take_action()`. This ensures that no matter what RL agent you make, and however you code it, the environment and training code can always interact with it in the same way. Check out the benchmark agents in the `agent` module to see examples.
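As a minimal sketch, a custom agent could look like the following. The import path and method signatures here are assumptions for illustration; check `BaseAgent` and the benchmark agents for the real ones.

```python
import random

from agent import BaseAgent  # assumed import path


class RandomWalkAgent(BaseAgent):
    """Toy agent that ignores rewards and moves randomly."""

    def process_reward(self, observation, reward):
        # A real agent would update its value estimates here;
        # this sketch simply discards the reward. Signature is assumed.
        pass

    def take_action(self, observation):
        # Pick one of four hypothetical move directions at random.
        return random.randint(0, 3)
```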
The `level_editor` module contains a file called `app.py`. Run this file to make new levels:

```
$ python app.py
```

This will start up a web server where you can edit levels. To view the level editor itself, go to `127.0.0.1:5000`. All levels will be saved to the `grid_configs/` directory. Where the grids are saved can be changed in the file `level_editor/__init__.py`, but this is not recommended.
We also provide a `grid_generator.py` file, found in the `level_editor` directory, to generate random grids. Usage is:
```
$ cd level_editor
$ python grid_generator.py
usage: grid_generator.py [-h] N_GRIDS N_ROOMS FILE_PREFIX

Randomly generate grids.

positional arguments:
  N_GRIDS      Number of grids to generate.
  N_ROOMS      Number of rooms to generate in each grid.
  FILE_PREFIX  Prefix to give to the generated file name.

options:
  -h, --help   show this help message and exit
```
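For example, the following call (the argument values are just illustrative) generates five grids with four rooms each, with file names starting with `random-grid`:

```
$ python grid_generator.py 5 4 random-grid
```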
The `world` module contains:
- `environment.py`
- `grid.py`
- `gui.py`
The environment is very important because it contains everything we hold dear, including ourselves.[^1] `Environment` is also the name of the class within which our RL agent will act.
The main interaction with `Environment` is through the following methods:
- `Environment()` to initialize the environment
- `get_observation()` to get an environment observation without taking a step or resetting the environment
- `reset()` to reset the environment
- `step()` to actually take a time step with the environment
Explanations of each of these methods and how to use them can be found in the examples in the `environment.py` file and in the documentation in the code itself. A rough sketch of how they fit together is shown below.
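This is only a sketch: the constructor arguments and the exact return values of `reset()` and `step()` are assumptions here, so check `environment.py` for the real signatures.

```python
from world import Environment  # assumed import path

# RandomWalkAgent is the toy agent sketched in the agent section above.
env = Environment("grid_configs/rooms-1.grd")  # constructor args are assumed
agent = RandomWalkAgent()

obs = env.reset()
for _ in range(1000):                  # step count is arbitrary here
    action = agent.take_action(obs)
    obs, reward = env.step(action)     # return values are assumed
    agent.process_reward(obs, reward)
```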
The `Grid` class is the world on which the agents actually move. It is essentially a fancy NumPy array with extra methods that make it easier for us to work with.
The graphical user interface (GUI) provides a way for you to actually see what the RL agent is doing. While it is performant and written using PyGame, running with the GUI is still about 1300x slower than running without one. Because of this, we recommend using it only while testing/debugging, not while training.
- python ~= 3.10
- numpy >= 1.24
- tqdm ~= 4
- pygame ~= 2.3
- flask ~= 2.2
- flask-socketio ~= 5.3
- pillow ~= 9.4
- colorcet ~= 3.0
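Assuming you use pip, the dependencies can be installed roughly as follows (a provided `requirements.txt`, if the repository has one, would be preferable; `tqdm ~= 4` is written as `tqdm~=4.0` because pip's compatible-release operator needs two version segments):

```
$ pip install "numpy>=1.24" "tqdm~=4.0" "pygame~=2.3" "flask~=2.2" "flask-socketio~=5.3" "pillow~=9.4" "colorcet~=3.0"
```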
[^1]: In case you missed it, this sentence is a joke. Please do not write all your code in the `Environment` class.