
Implementation of the ICML 2019 paper, EMI: Exploration with Mutual Information.


EMI: Exploration with Mutual Information

In ICML 2019

Hyoungseok Kim* (1, 2), Jaekyeom Kim* (1, 2), Yeonwoo Jeong (1, 2), Sergey Levine (3), Hyun Oh Song (1, 2)

*: Equal contribution, 1: Seoul National University, Department of Computer Science and Engineering, 2: Neural Processing Research Center, 3: UC Berkeley, Department of Electrical Engineering and Computer Sciences

This codebase contains the source code for our paper, EMI: Exploration with Mutual Information.

Citing this work

Please cite our paper if you find this work helpful in your research:

@inproceedings{kimICML19,
  Author    = {Hyoungseok Kim and Jaekyeom Kim and Yeonwoo Jeong and Sergey Levine and Hyun Oh Song},
  Title     = {EMI: Exploration with Mutual Information},
  Booktitle = {International Conference on Machine Learning (ICML)},
  Year      = {2019}}

Environment setup

Prerequisites

A non-virtual machine with the following components:

  • Ubuntu 16.04
  • CUDA 8.0
  • cuDNN 6.0
  • Conda

Setting up Conda environment

  • Run conda env create -f environment.yml to create the environment.
  • Activate the created environment with conda activate rllab3, then run pip install --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip.
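
As a quick sanity check after these two steps, the following minimal sketch reports whether the rllab3 environment is active and whether Lasagne can be imported. The helper name check_env is hypothetical and not part of this codebase. It assumes that conda exposes the active environment name through the CONDA_DEFAULT_ENV variable and that the Lasagne package is importable as lasagne.

```python
import importlib.util
import os


def check_env(expected="rllab3", module="lasagne", environ=None):
    """Return a list of problems with the current Python/conda setup.

    Hypothetical helper for illustration only; not part of the EMI codebase.
    """
    environ = os.environ if environ is None else environ
    problems = []

    # conda sets CONDA_DEFAULT_ENV to the active environment's name.
    active = environ.get("CONDA_DEFAULT_ENV")
    if active != expected:
        problems.append(
            f"active conda environment is {active!r}, expected {expected!r}"
        )

    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module) is None:
        problems.append(f"module {module!r} is not importable")

    return problems
```

Calling check_env() with no arguments inside the activated environment should return an empty list if both steps succeeded.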

Setting up MuJoCo

  • Create a subdirectory, ./vendor/mujoco/.
  • Obtain a MuJoCo license for your machine by following the instructions from their website if you don't have one. They offer a number of licensing options including 30-day free trials.
  • Copy mjkey.txt, the license key file, into ./vendor/mujoco/.
  • Download version 1.31 of the MuJoCo binaries for Linux from their website and unzip the archive.
  • Copy all the files in the mjpro131/bin/ directory of the extracted archive into ./vendor/mujoco/.
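
The resulting directory layout can be checked with a small script. The sketch below is a hypothetical helper, not part of this codebase. It only verifies that ./vendor/mujoco/ exists, contains mjkey.txt, and contains at least one other file. It does not check the names of the copied binaries, since those are not listed here.

```python
from pathlib import Path


def check_mujoco_dir(vendor_dir="./vendor/mujoco"):
    """Return a list of problems found in the MuJoCo vendor directory.

    Hypothetical helper for illustration only; not part of the EMI codebase.
    """
    root = Path(vendor_dir)
    if not root.is_dir():
        return [f"{root} does not exist"]

    problems = []

    # The license key file is expected directly inside the directory.
    if not (root / "mjkey.txt").is_file():
        problems.append("mjkey.txt is missing")

    # Anything besides the key is assumed to come from mjpro131/bin/.
    # Individual file names are deliberately not checked.
    others = [p for p in root.iterdir() if p.name != "mjkey.txt"]
    if not others:
        problems.append("no files copied from mjpro131/bin/")

    return problems
```

Running check_mujoco_dir() from the repository root returns an empty list when the key and at least one binary file are in place.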

Running experiments

  • Before running experiments, activate the conda environment by running conda activate rllab3.

  • To train an EMI agent on SwimmerGather, run:

    python examples/trpo_emi_mujoco.py
    
  • To train an EMI agent on SparseHalfCheetah, run:

    python examples/trpo_emi_mujoco.py --env=SparseHalfCheetah
    
  • To train an EMI agent on Montezuma's Revenge, run:

    python examples/trpo_emi_atari.py
    
  • On the first run, the script only creates a configuration file and then exits. If you see the configuration message, run the same command again.
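
The commands above can also be launched from a small Python wrapper, which is convenient when the first run has to be repeated. This is a minimal sketch under stated assumptions. The task labels in COMMANDS and the helper names build_command and run are hypothetical and chosen here for illustration. Only the script paths and the --env flag come from the commands listed above. Because the exact configuration message and exit status of the first run are not specified, the sketch does not inspect them and simply runs the command twice when first_time is set.

```python
import subprocess
import sys

# Hypothetical task labels mapped to the script arguments listed above.
COMMANDS = {
    "SwimmerGather": ["examples/trpo_emi_mujoco.py"],
    "SparseHalfCheetah": ["examples/trpo_emi_mujoco.py", "--env=SparseHalfCheetah"],
    "MontezumaRevenge": ["examples/trpo_emi_atari.py"],
}


def build_command(task, python=sys.executable):
    """Return the argument list used to train an EMI agent on the given task."""
    return [python] + COMMANDS[task]


def run(task, first_time=False):
    """Run the training command; repeat once if this is the first run."""
    cmd = build_command(task)
    # On a fresh checkout the first invocation only creates the config,
    # so a second invocation is needed to actually start training.
    attempts = 2 if first_time else 1
    for _ in range(attempts):
        # check=False: the exit status of the config-only run is not assumed.
        subprocess.run(cmd, check=False)
```

For example, run("SparseHalfCheetah", first_time=True) would invoke the SparseHalfCheetah command twice from the repository root, inside the activated rllab3 environment.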

Acknowledgements

This work was partially supported by Samsung Advanced Institute of Technology and Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No.2019-0-01367, BabyMind).

License

MIT License