This repository provides an implementation of learning-based Monte-Carlo Tree Search variants in the Pommerman environment. Our approaches leverage opponent models (planning agents) to transform the multiplayer game into single- and two-player games depending on the provided settings.
The simplest way to get started and execute runs is to build a docker image and run it as a container.
Available backends:
- TensorRT (NVIDIA GPU required): Tested with TensorRT 8.0.1 and PyTorch 1.9.0.
To use NVIDIA GPUs in docker containers, you have to install docker and nvidia-docker2. Have a look at the installation guide https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html.
We provide small scripts to facilitate building the image and running experiments.
- Build the image:

  ```
  $ bash docker/build.sh
  ```

  This automatically caches the dependencies. If you run it again, only the code is rebuilt. If you want to rebuild the whole image, call `bash docker/build.sh --no-cache`.
- Specify where you want to store the data generated by the experiments via the environment variable `$POMMER_DATA_DIR`. You can `export POMMER_DATA_DIR=/some/dir` or add `POMMER_DATA_DIR=/some/dir` as a prefix to the command in the following step.
- Create a container and run the training loop (replace `--help` with the arguments of your choice):

  ```
  $ bash docker/run.sh --help
  ```

  Note that `--dir` and `--exec` are already specified correctly by `docker/run.sh`.
- All GPUs are visible in the container and GPU 0 is used by default. You can specify the GPU to be used with `--gpu 4`.
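The data-directory setup above can be sketched as follows; `/tmp/pommer-data` is only a placeholder path for illustration:

```shell
# Choose where experiment data should be stored (placeholder path).
export POMMER_DATA_DIR=/tmp/pommer-data
mkdir -p "$POMMER_DATA_DIR"
# Verify the variable is set before calling docker/run.sh.
echo "$POMMER_DATA_DIR"
```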
Of course, you can also build and run the image manually. Have a closer look at the scripts from the previous section for details.
Additional notes:
- You can limit the GPU access of a container with `--gpus device=4`. However, PommerLearn has a `--gpu` argument that can be used instead.
- Warning: If you use rootless docker, the container will probably run out of memory. Adding `--ipc=host` or `--shm-size=32g` to the `docker run` command helps. This is also done by default in `docker/run.sh`.
- Generate an SL dataset with 1 million samples:

  ```
  $POMMER_EXEC --mode=ffa_sl --max-games=-1 --chunk-size=1000 --chunk-count=1000 --log --file-prefix=./1M_simple
  ```

  where `$POMMER_EXEC` can be your `PommerLearn` executable or `MODE=exec bash docker/run.sh`.
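The dataset size follows directly from the chunk parameters: `--chunk-size=1000` samples per chunk times `--chunk-count=1000` chunks gives the 1 million samples. A quick sanity check:

```shell
# samples = chunk-size * chunk-count
chunk_size=1000
chunk_count=1000
echo $((chunk_size * chunk_count))
```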
- Train the SL model: Run `pommerlearn/training/train_cnn.py` with the following modified arguments (see the bottom of the file): `"dataset_path": "1M_simple_0.zr", "test_size": 0.01, "output_dir": "./model-sl"`, and save the resulting model as `$POMMER_DATA_DIR/model-sl`.
- Generate a dummy model by running `pommerlearn/debug/create_dummy_model.py` and save it as `$POMMER_DATA_DIR/model-dummy`.
- You can now perform search experiments with both models. Use `POMMER_1VS1=false MODE=exec bash run.sh` for the single-player search and `POMMER_1VS1=true MODE=exec bash run.sh` for the two-player search.
- To reproduce our results, generate 5 SL and dummy models labeled with the respective suffixes `-0` to `-4`. Navigate into the docker directory and run the search experiments:

  ```
  ./docker $ bash search_experiments.sh
  ```

  The results will be recorded in a single csv file.
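The naming scheme for the 5 model pairs can be sketched with a small loop; this assumes the suffix is simply appended to the model directory names (replace the `echo` with the actual generation steps above):

```shell
# Print the suffixed model names -0 to -4 (sketch only).
for i in 0 1 2 3 4; do
  echo "model-sl-$i model-dummy-$i"
done
```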
Navigate into the docker directory and run the RL experiments:

```
./docker $ bash rl_experiments.sh
```

This will create a new directory in your working directory to store the training logs. You will find the results in `$POMMER_DATA_DIR/archive` and the tensorboard runs in `$POMMER_DATA_DIR/runs`.
To perform experiments in the team mode, collect samples with the option `--mode=team_sl` and otherwise proceed like in the FFA mode, e.g.
```
$POMMER_EXEC --mode=team_sl --max-games=-1 --chunk-size=1000 --chunk-count=1000 --log --file-prefix=./1M_simple_team
```
You can then run `pommerlearn/training/train_cnn.py` on the generated data set. This will automatically use the value targets for the team mode due to the meta information in the data set.
For the python side:

- Python 3.7 and `pip`
- It is recommended to use virtual environments. This guide uses Anaconda. Create an environment named `pommer`:

  ```
  $ conda create -n pommer python=3.7
  ```
For the C++ side:
- Essential build tools (`gcc`, `make`, `cmake`):

  ```
  $ sudo apt install build-essential cmake
  ```
- The dependencies z5, xtensor, boost and json by nlohmann can be installed directly with conda in the pommer environment:

  ```
  (pommer) $ conda install -c conda-forge z5py xtensor boost nlohmann_json blosc
  ```
- Blaze needs to be installed manually. Note that it can be unpacked anywhere; it does not have to be `/usr/local`. For further information, refer to the installation guide or the Dockerfiles in this repository:

  ```
  cmake -DCMAKE_INSTALL_PREFIX=/usr/local/
  sudo make install
  export BLAZE_PATH=/usr/local/include/
  ```
- Manual installation of TensorRT (not Torch-TensorRT), including CUDA and cuDNN. Please refer to the installation guide by NVIDIA: https://developer.nvidia.com/tensorrt-getting-started
This repository depends on submodules. Clone it and initialize all submodules with

```
$ git clone git@gitlab.com:jweil/PommerLearn.git
$ cd PommerLearn
$ git submodule update --init
```
- The current version requires you to set the following environment variables:
  - `CONDA_ENV_PATH`: path of your conda environment (e.g. `~/conda/envs/pommer`)
  - `BLAZE_PATH`: blaze installation path (e.g. `/usr/local/include`)
  - `CUDA_PATH`: cuda installation path (e.g. `/usr/local/cuda`)
  - `TENSORRT_PATH`: TensorRT installation path (when using the CrazyAra TensorRT backend, e.g. `/usr/src/tensorrt`)
  - [`Torch_DIR`] (when using the CrazyAra Torch backend, currently untested)
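Put together, the variables can be exported before invoking cmake; all paths below are example values from the list above and must be adjusted to your installation:

```shell
# Example values only; point these at your actual installations.
export CONDA_ENV_PATH="$HOME/conda/envs/pommer"
export BLAZE_PATH=/usr/local/include
export CUDA_PATH=/usr/local/cuda
export TENSORRT_PATH=/usr/src/tensorrt  # only needed for the TensorRT backend
echo "$BLAZE_PATH"
```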
- Build the C++ environment with the provided `CMakeLists.txt`. To use TensorRT >= 8 (recommended), you have to specify `-DUSE_TENSORRT8=ON`:

  ```
  /PommerLearn/build $ cmake -DCMAKE_BUILD_TYPE=Release -DUSE_TENSORRT8=ON -DCMAKE_CXX_COMPILER="$(which g++)" ..
  /PommerLearn/build $ make VERBOSE=1 all -j8
  ```
Optional: You can install PyTorch 1.9.0 with GPU support via

```
conda install -y pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=11.1 -c conda-forge -c pytorch
```

The remaining python runtime dependencies can be installed with

```
(pommer) $ pip install -r requirements.txt
```
Before starting the RL loop, you can check whether everything is set up correctly by creating a dummy model and loading it in the C++ executable:

```
(pommer) /PommerLearn/build $ python ../pommerlearn/debug/create_dummy_model.py
(pommer) /PommerLearn/build $ ./PommerLearn --mode=ffa_mcts --model=./model/onnx
```

You can then start training by running

```
(pommer) /PommerLearn/build $ python ../pommerlearn/training/rl_loop.py
```
Prerequisites and Building
- Make sure that you've pulled all submodules recursively
- In older versions of TensorRT, you have to manually comment out `using namespace sample;` in `deps/CrazyAra/engine/src/nn/tensorrtapi.cpp`.
- We experienced issues with `std::filesystem` being undefined when using GCC 7.5.0. We recommend updating to a more recent version, e.g. GCC 11.2.0.
Running
- For runtime issues like `libstdc++.so.6: version 'GLIBCXX_3.4.30' not found`, try loading your system libraries with `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu/`. On some systems, ctypes uses a different libstdc++ from the conda environment instead of the correct lib path. As a last resort, you can back up the original library with `mv /conda-lib-path/libstdc++.so.6 /conda-lib-path/libstdc++.so.6.old` and then create a symbolic link with `ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /conda-lib-path/libstdc++.so.6`.
- If you encounter errors like `ModuleNotFoundError: No module named 'training'`, set your `PYTHONPATH` to the `pommerlearn` directory, e.g. `export PYTHONPATH=/PommerLearn/pommerlearn`.
- When loading `tensorboard` runs, you can get errors like `Error: tonic::transport::Error(Transport, hyper::Error(Accept, Os { code: 24, kind: Other, message: "Too many open files" }))`. The argument `--load_fast=false` might help.
You can install the plotting utility for gprof from https://github.com/jrfonseca/gprof2dot. Activate the CMake option `USE_PROFILING` in `CMakeLists.txt` and rebuild. Then run the executable and generate the plot:

```
./PommerLearn --mode ffa_mcts --max_games 10
gprof PommerLearn | gprof2dot | dot -Tpng -o profile.png
```
If you find this repository helpful, please consider citing our paper:

```
@inproceedings{weil2023knowYourEnemy,
  author={Weil, Jannis and Czech, Johannes and Meuser, Tobias and Kersting, Kristian},
  title={{Know your Enemy: Investigating Monte-Carlo Tree Search with Opponent Models in Pommerman}},
  booktitle={Proceedings of the Adaptive and Learning Agents Workshop (ALA) at AAMAS 2023},
  url={https://alaworkshop2023.github.io/},
  year={2023}
}
```