Implementation of the DeDOL algorithm proposed in 'Deep Reinforcement Learning for Green Security Games with Real-Time Information', AAAI 2019.
For more details of the algorithm, please refer to the paper.
- Tensorflow GPU
- cvxopt
- nashpy
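For example, the dependencies can be installed with pip. The snippet below is only a sketch assuming a TensorFlow 1.x-era environment; the exact package names and versions may differ on your setup.

```bash
# Illustrative only: install the required packages
pip install tensorflow-gpu cvxopt nashpy
```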
- env.py: the GSG-I game model class
- DeDOL.py: the main file for running the DeDOL algorithms
- DeDOL_util.py: helper functions for DeDOL.py
- DeDOL_Global_Retrain.py: loads the models trained in local modes and then runs more iterations of global-mode training
- GUI_util.py: helper functions for displaying the game in the GUI
- GUI.py: tests the performance of trained DQNs using the GUI
- maps.py: helper functions for generating different kinds of maps
- patroller_cnn.py: the patroller CNN strategy representation
- poacher_cnn.py: the poacher CNN strategy representation
- patroller_rule.py: our heuristic parameterized random-walk patroller
- poacher_rule.py: our heuristic parameterized random-walk poacher
- patroller_randomsweeping.py: our heuristic random-sweeping patroller
- replay_buffer.py: the replay buffer data structure needed for DQN training and prioritized experience replay
- AC_patroller: the actor-critic patroller. It performs poorly and is not adopted in the DeDOL algorithm.
Most of the files include further detailed comments.
-
First, run DeDOL.py for the different local modes or the pure global mode.
- The default training parameters should work well. You can also explore other settings yourself.
- To run in different local modes, set the 'po_location' parameter to 0, 1, 2, or 3, representing the four different entry points. The code will automatically generate new directories saving the DQN models trained in the different local modes, for later loading in DeDOL_Global_Retrain.py.
- E.g., the command 'python DeDOL.py --row_num 5 --po_location 0 --map_type gauss' runs the DeDOL algorithm on a 5x5 grid with the Mixture Gaussian map, where the poacher always enters the grid world from the top-left corner. The trained DQNs will be stored in the directory './Results_55_gauss_mode0/'. A sketch covering all four modes is shown after this list.
- Training the DQNs can be quite time-consuming in the complex GSG-I game, and several DeDOL iterations are required to evolve a reasonable strategy profile. Be patient :).
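For example, a minimal shell sketch (using only the flags from the command above; the directory names follow the pattern the script generates) that trains the local-mode DQNs for all four entry points in sequence:

```bash
# Train DQNs in each of the four local modes (poacher entry points 0-3)
# on a 5x5 grid with the Mixture Gaussian map
for mode in 0 1 2 3; do
    python DeDOL.py --row_num 5 --po_location $mode --map_type gauss
done
# Models are saved under ./Results_55_gauss_mode0/ ... ./Results_55_gauss_mode3/
```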
-
To collect the DQNs and run more DO iterations in global mode:
- You should first run DeDOL.py in all local modes.
- Run DeDOL_Global_Retrain.py and set its load_path parameter to match the save_path you used in DeDOL.py, so that it loads the DQNs previously trained in the local modes. The load_path should omit the trailing number that specifies the mode, since the script automatically collects the DQNs trained in all local modes. E.g., if the save paths were ./Results_33_random_mode0/ through ./Results_33_random_mode3/, the load_path should be ./Results_33_random_mode. See the example command after this list.
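A hedged example, assuming load_path (and the grid/map settings) are exposed as command-line flags in the same style as DeDOL.py; check the argument parser in DeDOL_Global_Retrain.py for the exact names:

```bash
# Collect the DQNs trained in all four local modes (mode number omitted
# from load_path) and run further double-oracle iterations in global mode
python DeDOL_Global_Retrain.py --load_path ./Results_33_random_mode --row_num 3 --map_type random
```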
-
To visualize the game process:
- Running GUI.py with the arg 'load' set to False visualizes the behaviour of a parameterized poacher and a random-sweeping patroller. You can change parameters like 'row_num', 'map_type' and 'max_time' for fun.
- If you want to visualize the performance of trained DQNs, run GUI.py with the arg 'load' set to True, and set the corresponding 'pa_load_path' and 'po_load_path' args to the paths where you stored your DQN models. Example commands are sketched below.
- A pretrained patroller DQN against a heuristic parameterized poacher, and a pretrained poacher DQN against a random-sweeping patroller (in a 7x7 grid world), are contained in the Pre-trained_Models directory.
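For instance, the commands below are a sketch; the argument names follow the args listed above, but the exact boolean flag syntax may differ slightly (check GUI.py's argument parser):

```bash
# Watch the heuristic parameterized poacher vs. the random-sweeping patroller
python GUI.py --load False --row_num 7 --map_type random

# Watch trained DQNs, e.g. the models shipped in the Pre-trained_Models directory
python GUI.py --load True \
    --pa_load_path <path-to-patroller-DQN> \
    --po_load_path <path-to-poacher-DQN>
```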