The main script is generalGDICE.py. Running it as is will run GDICE on the 4x3 maze POMDP environment with some default parameters. You can alter the main script as follows:
- Choose the environment in the first line. Any POMDP registered in gym-pomdps is accessible, though some (e.g., rocksample) may cause memory errors. Reference them by name (e.g., "POMDP-4x3-episodic-v0").
- NOTE: Rocksample is too big to fit on a system with 32 GB of memory...
- Create a controller distribution for the agent in the second line, specifying the number of nodes in the first argument.
- Define your parameters with a GDICEParams object in the 3rd line. In the constructor, you can specify:
  - Number of nodes
  - Number of iterations
  - Number of controller samples per iteration
  - Number of simulations each controller runs on the environment
  - Number of best samples to update with in each iteration
  - The learning rate of the controller distribution
  - A value threshold, which additionally filters out samples below a certain value. By default, this is off (None)
- Define a pool object in the 4th line if you want parallel processing.
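
To make the roles of the "best samples", "learning rate", and "value threshold" parameters more concrete, here is a minimal sketch of a cross-entropy-style update on a single categorical distribution. It is an illustration only, not the code in generalGDICE.py: the function name `update_distribution` and all of its arguments are hypothetical, and the real controller distribution covers actions and node transitions for every node.

```python
def update_distribution(probs, samples, values, n_best, lr, value_threshold=None):
    """Hypothetical sketch of one update step for a categorical distribution.

    probs:   current probabilities over the options (sums to 1)
    samples: sampled option index for each controller sample
    values:  estimated value of each controller sample
    """
    # Rank the samples by value, best first, and keep the top n_best.
    ranked = sorted(zip(values, samples), key=lambda vs: vs[0], reverse=True)
    kept = ranked[:n_best]

    # Optionally drop kept samples whose value is below the threshold.
    if value_threshold is not None:
        kept = [(v, s) for v, s in kept if v >= value_threshold]

    # If nothing survived the filtering, leave the distribution unchanged.
    if not kept:
        return list(probs)

    # Empirical frequency of each option among the kept samples.
    counts = [0] * len(probs)
    for _, s in kept:
        counts[s] += 1
    freq = [c / len(kept) for c in counts]

    # Blend the old distribution with the empirical one using the learning rate.
    return [(1 - lr) * p + lr * f for p, f in zip(probs, freq)]


# Example: four options, six samples; option 1 appears in all three best samples.
probs = [0.25, 0.25, 0.25, 0.25]
samples = [0, 1, 2, 3, 1, 1]
values = [0.1, 0.9, 0.2, 0.05, 0.8, 0.7]
new_probs = update_distribution(probs, samples, values, n_best=3, lr=0.5)
print(new_probs)  # [0.125, 0.625, 0.125, 0.125]
```

In this sketch, a higher learning rate moves the distribution faster toward the best samples, and a value threshold set above every sample's value leaves the distribution untouched for that iteration.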
The repository includes a setup.py script and a prebuilt source distribution.
To install, first install the dependencies (listed below), then install the package from the included tar.gz file. Scripts are placed in your environment's bin folder automatically, and the rest of the package is installed alongside your other Python packages.
- Python 3
- Numpy
- Gym
- Andrea's repositories, rl_parsers and gym_pomdps. These are expected to be pip installed with their included script
- gym_dpomdps (included in repo, "pip install gym_dpomdps/dist/gym_dpomdps-0.1.0.tar.gz")
- Create a grid search function to automatically work through environments to determine the best G-DICE parameters
- Extend to DPOMDPs with a parser
- Make the parallel approach more memory efficient
- Extend to continuous observation domains
- Apply to gym-minigrid