AutoCFR: Learning to Design Counterfactual Regret Minimization Algorithms
Hang Xu*, Kai Li*, Haobo Fu, Qiang Fu, Junliang Xing#
AAAI 2022 (Oral)
sudo apt install graphviz xdg-utils xvfb
conda create -n AutoCFR python==3.7.10
conda activate AutoCFR
pip install -e .
pytest tests
To easily run the training code, we provide a unified interface. Each experiment generates an experiment id and creates a unique directory in logs. We use games implemented by OpenSpiel [1].
python scripts/train.py
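For reference, the training games are standard OpenSpiel games, which you can load and inspect directly through OpenSpiel's Python API. The snippet below is a minimal sketch using OpenSpiel's public interface; kuhn_poker is only an illustrative game name, not necessarily one of the configured training games.

import pyspiel

# Load a small imperfect-information game by name and inspect its basic properties.
game = pyspiel.load_game("kuhn_poker")
state = game.new_initial_state()
print(game.num_players(), game.num_distinct_actions())
print(state.is_chance_node(), state.legal_actions())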
You can modify the configuration in the following ways:
- Modify the operations. Some code is adapted from Meta-learning curiosity algorithms [2]. Specify your operations in autocfr/program/operations.py and specify the list of operations to use in autocfr/generator/mutate.py.
- Modify the type and number of games used for training. Specify your games in autocfr/utils.py:load_game_configs (a hypothetical sketch of such a configuration follows this list).
- By default, we learn from bootstrapping. If you want to learn from scratch, set init_algorithms_file to ["models/algorithms/empty.pkl"] in scripts/train.py.
- Modify the hyperparameters. Edit the file scripts/train.py.
- Train on distributed servers. Follow Ray's instructions to set up your private cluster (typically ray start --head on the head node and ray start --address=<head-node-ip>:6379 on each worker; see the Ray documentation for details) and set ray.init(address="auto") in scripts/train.py.
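As a purely illustrative example for the game-configuration option above, such a list might look like the sketch below. The keys and values are assumptions made for illustration; they are not the actual schema of autocfr/utils.py:load_game_configs, so check that function for the real format.

# Hypothetical sketch only: the keys "name", "iterations", and "weight" are
# illustrative assumptions, not the repository's actual configuration schema.
def load_game_configs():
    return [
        {"name": "kuhn_poker", "iterations": 1000, "weight": 1.0},
        {"name": "liars_dice", "iterations": 1000, "weight": 1.0},
    ]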
You can use TensorBoard to monitor the training process.
tensorboard --logdir=logs
Run the following script to test algorithms learned by AutoCFR. By default, we test the algorithm with the highest score. logid is the generated unique experiment id. The results are saved in the folder models/games.
python scripts/test_learned_algorithm.py --logid={experiment id}
Run the following script to test the learned algorithms from the paper, i.e., DCFR+, AutoCFR4, and AutoCFRS. The results are saved in the folder models/games.
python scripts/test_learned_algorithm_in_paper.py
We use PokerRL [3] to test learned algorithms in HUNL Subgames.
cd PokerRL
pip install -e .
tar -zxvf texas_lookup.tar.gz
Run the following scripts to test the learned algorithms from the paper, i.e., DCFR+, AutoCFR4, and AutoCFRS. The results are saved in the folder PokerRL/models/.
python PokerRL/scripts/run_cfr.py --iters 20000 --game subgame3 --algo=DCFRPlus
python PokerRL/scripts/run_cfr.py --iters 20000 --game subgame3 --algo=AutoCFR4
python PokerRL/scripts/run_cfr.py --iters 20000 --game subgame3 --algo=AutoCFRS
If you use AutoCFR in your research, you can cite it as follows:
@inproceedings{AutoCFR,
title = {AutoCFR: Learning to Design Counterfactual Regret Minimization Algorithms},
author = {Xu, Hang and Li, Kai and Fu, Haobo and Fu, Qiang and Xing, Junliang},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2022},
pages = {5244--5251}
}
[1] Lanctot, M.; Lockhart, E.; Lespiau, J.-B.; Zambaldi, V.; Upadhyay, S.; Pérolat, J.; Srinivasan, S.; Timbers, F.; Tuyls, K.; Omidshafiei, S.; Hennes, D.; Morrill, D.; Muller, P.; Ewalds, T.; Faulkner, R.; Kramár, J.; Vylder, B. D.; Saeta, B.; Bradbury, J.; Ding, D.; Borgeaud, S.; Lai, M.; Schrittwieser, J.; Anthony, T.; Hughes, E.; Danihelka, I.; and Ryan-Davis, J. 2019. OpenSpiel: A Framework for Reinforcement Learning in Games. CoRR, abs/1908.09453.
[2] Alet, F.; Schneider, M. F.; Lozano-Perez, T.; and Kaelbling, L. P. 2019. Meta-learning curiosity algorithms. In International Conference on Learning Representations, 1–21.
[3] Steinberger, E. 2019. PokerRL. https://github.com/TinkeringCode/PokerRL.