Stores G-DICE code as I convert it

G-DICE

Basic Usage

The main script is generalGDICE.py. Run unmodified, it applies G-DICE to the 4x3 maze POMDP environment with default parameters. To change its behavior, edit the main script as follows:

  1. Choose the environment in the first line. Any POMDP registered in gym-pomdps is accessible by name (e.g., "POMDP-4x3-episodic-v0"), though some (e.g., rocksample) may cause memory errors.
    • NOTE: Rocksample is too big to fit on a system with 32 GB of memory...
  2. Create a controller distribution for the agent in the second line, passing the number of nodes as the first argument.
  3. Define your parameters with a GDICEParams object in the third line. In the constructor, you can specify:
    1. Number of nodes
    2. Number of iterations
    3. Number of controller samples per iteration
    4. Number of simulations each controller runs on the environment
    5. Number of best samples used for the update in each iteration
    6. The learning rate of the controller distribution
    7. A value threshold that additionally filters out samples whose value falls below it. This is off (None) by default.
  4. Define a pool object in the fourth line if you want parallel processing.
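
To make the role of the seven parameters concrete, here is a minimal, self-contained sketch of a cross-entropy-style loop in the spirit of G-DICE. It is not the code in generalGDICE.py: the names SketchParams, simulate, and run_sketch are hypothetical, the fields only mirror the list above and may not match the real GDICEParams constructor, a toy noisy objective stands in for a gym-pomdps environment, and a "controller" is reduced to one action per node with no node transitions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

N_ACTIONS = 2  # toy action space (assumption, not taken from any environment)


@dataclass
class SketchParams:
    """Hypothetical stand-in for GDICEParams; field names are assumptions."""
    n_nodes: int = 3
    n_iterations: int = 30
    n_samples: int = 50          # controller samples per iteration
    n_simulations: int = 20      # simulations per sampled controller
    n_best: int = 5              # best samples kept for the update
    learning_rate: float = 0.2
    value_threshold: Optional[float] = None  # off by default


def simulate(controller: List[int], rng: random.Random) -> float:
    """Toy noisy return: each node that picks action 1 earns about 1."""
    return sum(controller) + rng.gauss(0.0, 0.1)


def run_sketch(params: SketchParams, seed: int = 0) -> Tuple[List[List[float]], float]:
    rng = random.Random(seed)
    # One categorical distribution over actions per node, initially uniform.
    probs = [[1.0 / N_ACTIONS] * N_ACTIONS for _ in range(params.n_nodes)]
    best_value = float("-inf")
    for _ in range(params.n_iterations):
        scored = []
        for _ in range(params.n_samples):
            # Sample one action per node from the current distribution.
            controller = [rng.choices(range(N_ACTIONS), weights=p)[0] for p in probs]
            # Estimate the controller's value as the mean over simulations.
            total = sum(simulate(controller, rng) for _ in range(params.n_simulations))
            scored.append((total / params.n_simulations, controller))
        # Keep the n_best highest-valued samples.
        scored.sort(key=lambda vc: vc[0], reverse=True)
        elite = scored[:params.n_best]
        # Optionally drop samples below the value threshold.
        if params.value_threshold is not None:
            elite = [vc for vc in elite if vc[0] >= params.value_threshold]
        if not elite:
            continue  # nothing passed the filter; leave the distribution as is
        best_value = max(best_value, elite[0][0])
        # Move each node's distribution toward the elite action frequencies.
        lr = params.learning_rate
        for node in range(params.n_nodes):
            for action in range(N_ACTIONS):
                freq = sum(1 for _, c in elite if c[node] == action) / len(elite)
                probs[node][action] = (1 - lr) * probs[node][action] + lr * freq
    return probs, best_value


if __name__ == "__main__":
    final_probs, best = run_sketch(SketchParams())
    print(final_probs, best)
```

In this sketch, raising the number of samples or simulations makes each iteration slower but the value estimates less noisy, while a larger learning rate makes the distribution commit to the elite samples faster. Evaluating the sampled controllers is the independent, repeated work that a pool object would parallelize.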

Install

The repository includes a setup.py script and a prebuilt source distribution.

To install, first install the dependencies (listed below), then install the package from the included tar.gz file. Scripts are placed in your /bin folder automatically, and the rest of the package is installed like any other package.

Dependencies

Future work

  • Create a grid search function to automatically work through environments to determine the best G-DICE parameters
  • Extend to DPOMDPs with a parser
  • Make the parallel approach more memory efficient
  • Extend to continuous observation domains
  • Apply to gym-minigrid