Platform: Linux
Python: 3.8
Create environment:
conda env create -f environment.yml
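Then activate it (the environment name is defined in environment.yml; the name below is only a placeholder):
conda activate <env-name>  # replace <env-name> with the name: field in environment.yml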
Locate the gym-retro game folder:
import os
import retro
retro_directory = os.path.dirname(retro.__file__)
game_dir = "data/stable/StreetFighterIISpecialChampionEdition-Genesis"
print(os.path.join(retro_directory, game_dir))
Copy the state files from data/sf and the ROM file into this game folder.
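A minimal sketch of this copy step (run from the repo root; it assumes the state files in data/sf use gym-retro's .state extension, and that you have legally obtained the ROM yourself):
import glob
import os
import shutil

import retro

# Resolve the gym-retro game folder, as printed above.
retro_directory = os.path.dirname(retro.__file__)
game_dir = os.path.join(
    retro_directory, "data/stable/StreetFighterIISpecialChampionEdition-Genesis"
)

# Copy the bundled state files into the game folder (assumes .state extension).
for state_file in glob.glob("data/sf/*.state"):
    shutil.copy(state_file, game_dir)

# Then place your ROM file in the same folder (gym-retro stores Genesis ROMs as rom.md).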
Disclaimer: We are unable to provide you with any game ROMs. It is the user's own legal responsibility to acquire a game ROM for emulation. This library should only be used for non-commercial research purposes.
The environment is specified in main/common/retro_wrappers.py. It tracks the internal state of the game and is compatible with the Gym interface and popular RL packages such as stable-baselines.
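As an illustration of this compatibility (a sketch, not the repo's training code; it assumes stable-baselines3 with the classic Gym API and uses a placeholder state name):
import retro
from stable_baselines3 import PPO

# Any Gym-compatible environment, including the wrapper in retro_wrappers.py,
# can be plugged into a stable-baselines-style learner like this.
env = retro.make(
    game="StreetFighterIISpecialChampionEdition-Genesis",
    state="Champion.Level1.RyuVsRyu",  # placeholder; use a state installed above
)
model = PPO("CnnPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
env.close()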
Algorithms are implemented in main/common/algorithms.py and main/common/league.py. Specifically, IPPO in algorithms.py implements the IPPO and 2Timescale methods, while League, PSRO, and FSP are implemented in league.py. We use PPO from stable-baselines as the backbone algorithm for all of these implementations. The League implementation adapts the pseudocode in main/common/pseudocode, which comes from the prior work AlphaStar.
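As a rough, self-contained illustration of the 2Timescale idea (not the repo's implementation), the two independent learners are updated at different frequencies controlled by a scale factor:
# Toy sketch: agent A updates every iteration, agent B only every `scale` iterations.
scale = 3  # illustrative value; scale=1 recovers plain IPPO (both update every step)
updates_a, updates_b = 0, 0
for step in range(1, 13):
    updates_a += 1
    if step % scale == 0:
        updates_b += 1
print(updates_a, updates_b)  # 12 4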
RL against the built-in CPU player:
# ${side}: left or right; ${state}: CPU difficulty level (number of stars)
python train.py --reset=round \
--state=stars/Champion.Level1.RyuVsRyu.${side}_star${state} \
--side=${side} \
--model-name-prefix=ppo_ryu_${side}_star${state} \
--save-dir=trained_models/ppo_ryu_${side}_star${state} \
--log-dir=logs/ppo_ryu_${side}_star${state} \
--video-dir=videos/ppo_ryu_${side}_star${state} \
--num-epoch=50 \
--enable-combo --null-combo --transform-action
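For example, to train a left-side agent against the 8-star CPU (illustrative values; the matching state file must be present in data/sf):
side=left
state=8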
RL with curriculum learning:
python finetune.py --reset=round \
--model-name-prefix=ppo_ryu_finetune \
--save-dir=trained_models/ppo_ryu_finetune \
--log-dir=logs/ppo_ryu_finetune \
--video-dir=videos/ppo_ryu_finetune \
--finetune-dir=finetune/ppo_ryu_finetune \
--num-epoch=25
IPPO / 2Timescale:
python ippo.py --reset=${task} \
--model-name-prefix=ippo_ryu_2p_scale_${scale}_${seed} \
--save-dir=trained_models/ippo_ryu_2p_scale_${scale}_${seed} \
--log-dir=logs/ippo_ryu_2p_scale_${scale}_${seed} \
--video-dir=videos/ippo_ryu_2p_scale_${scale}_${seed} \
--finetune-dir=finetune/ippo_ryu_2p_scale_${scale}_${seed} \
--num-epoch=50 \
--enable-combo --null-combo --transform-action \
--other-timescale=${scale} \
--seed=${seed}
# scale=1 is equivalent to IPPO
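For example (illustrative values; --reset=round is the value used elsewhere in this README):
task=round
scale=4
seed=0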
League / PSRO / FSP:
python train_ma.py --reset=round \
--save-dir=trained_models/ma \
--log-dir=logs/ma \
--left-model-file=trained_models/ppo_ryu_left_star8/ppo_ryu_left_star8_final_steps \
--right-model-file=trained_models/ppo_ryu_right_star8/ppo_ryu_right_star8_final_steps \
--enable-combo --null-combo --transform-action \
--seed=${seed}
# add --psro-league for PSRO or --fsp-league for FSP
Single-Agent RL Exploiters:
python best_response.py --reset=round \
--model-name-prefix=br_${model}/seed_${seed} \
--save-dir=trained_models/ma_br/${model}/seed_${seed} \
--log-dir=logs/ma_br/${model}/seed_${seed} \
--video-dir=videos/ma_br/${model}/seed_${seed} \
--finetune-dir=finetune/ma_br/${model}/seed_${seed} \
--model-file=/path/to/model \
--num-epoch=50 \
--enable-combo --null-combo --transform-action \
--update-right=0 \
--seed=${seed}
# --model-file loads a 2P policy; left and right 1P policies can also be loaded separately via --left-model-file and --right-model-file
# --update-right=0 exploits the right policy by freezing it (it is not updated)
Play with trained policies:
python play_with_ai.py  # set the model path in play_with_ai.py; key mappings are defined in common/interactive.py
Stay tuned for support for more fighting games! You can also integrate your own games by implementing a wrapper environment similar to main/common/retro_wrappers.py.
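A minimal sketch of such a wrapper (the game name, tracked fields, and reward handling are placeholders, not the repo's implementation):
import gym
import retro

class MyGameWrapper(gym.Wrapper):
    """Skeleton wrapper that tracks per-step game info, in the spirit of retro_wrappers.py."""

    def __init__(self, game="MyFightingGame-Genesis", state=retro.State.DEFAULT):
        super().__init__(retro.make(game=game, state=state))
        self.last_info = {}

    def reset(self, **kwargs):
        self.last_info = {}
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Track internal game variables exposed by the integration's data.json
        # (e.g. health bars, round timer) to shape rewards or detect round ends.
        self.last_info = info
        return obs, reward, done, info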
If you find our repo useful, please consider citing our work:
@inproceedings{lifightladder,
title={FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning},
author={Li, Wenzhe and Ding, Zihan and Karten, Seth and Jin, Chi},
booktitle={Forty-first International Conference on Machine Learning},
year={2024}
}