sjtu-marl/malib

How to use PPO to train in psro_scenario

Opened this issue · 1 comment

I cannot find an implementation of PPO in this project. From the docs I know the policy is compatible with Tianshou, but what about the trainer? How can I use PPO to train in psro_scenario? I would appreciate it if you could answer my question.

@donotbelieveit PPO is not ready yet, as further tests are required, but you can follow our upcoming submission to malib.rl.ppo (coming soon). By the way, you can refer to the given training example (here) for using RL subroutines in PSRO. If you want to understand the mechanism of the RL trainer, please refer to this MARL example: examples/run_gym.py. Also, please feel free to open a PR if you have any ideas to enrich our (MA)RL algorithm lib under malib/rl. :)
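
Until the official malib.rl.ppo module lands, here is a minimal, framework-agnostic sketch of the clipped surrogate objective that any PPO trainer computes. This is purely illustrative (plain Python, not malib's actual implementation or API); the function name and signature are hypothetical:

```python
# Hypothetical illustration: per-sample PPO-Clip objective,
# L = min(r * A, clip(r, 1 - eps, 1 + eps) * A),
# where r is the new/old policy probability ratio and A is the advantage.
# Not malib's implementation, which is still under test.
def ppo_clip_objective(ratio: float, advantage: float, clip_eps: float = 0.2) -> float:
    # Clamp the probability ratio into [1 - eps, 1 + eps].
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the pessimistic (smaller) of the clipped and unclipped terms.
    return min(ratio * advantage, clipped * advantage)

# A positive advantage is capped once the ratio exceeds 1 + eps:
print(ppo_clip_objective(1.5, 2.0))   # ratio clipped to 1.2 -> 2.4
# A negative advantage is floored once the ratio drops below 1 - eps:
print(ppo_clip_objective(0.5, -1.0))  # ratio clipped to 0.8 -> -0.8
```

In malib's PSRO loop, a trainer built around this objective would fill the same role the DQN/SAC trainers play in the existing training example: it is the inner-loop best-response learner invoked at each PSRO iteration.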