[WIP] A collection of DRL algorithms with intrinsic rewards

Intrinsic-Rewards

A collection of deep reinforcement learning algorithms with intrinsic rewards, based on Rainy and PyTorch.

Setup

First, install pipenv, for example with pip:

pip install pipenv --user

Then create a virtual environment so that the dependencies are installed in isolation:

pipenv --site-packages --three install

Run

RND

With 32 parallel workers:

pipenv run python experiments/rnd_atari.py --override='config.nworkers=32' train

With 64 parallel workers (the default):

pipenv run python experiments/rnd_atari.py train

With 128 parallel workers (requires Horovod):

horovodrun -np 2 -H localhost:1,$other_host_name:1 pipenv run python experiments/rnd_atari.py train

Implemented Algorithms

Random Network Distillation
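
For readers new to the method, the core idea of Random Network Distillation can be sketched in a few lines. This is only an illustrative toy, not the code in this repository: it uses dependency-free plain Python and linear maps instead of PyTorch CNNs, it omits the observation and reward normalization used in practice, and the names `RNDSketch`, `intrinsic_reward`, and `update` are hypothetical. A fixed random target network embeds each observation, a predictor is trained to match it on visited observations, and the prediction error serves as the intrinsic reward.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Return a random (out_dim x in_dim) weight matrix for a linear map."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def forward(weights, x):
    """Apply the linear map to the observation vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class RNDSketch:
    """Toy RND module (hypothetical name): linear target and predictor."""

    def __init__(self, obs_dim, feat_dim=4, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = make_linear(obs_dim, feat_dim, rng)     # fixed, never trained
        self.predictor = make_linear(obs_dim, feat_dim, rng)  # trained to match target
        self.lr = lr

    def intrinsic_reward(self, obs):
        """Mean squared error between predictor and target features."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)

    def update(self, obs):
        """One SGD step on the predictor for a single visited observation."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        n = len(t)
        for row, pi, ti in zip(self.predictor, p, t):
            grad_scale = 2.0 * (pi - ti) / n  # d(MSE)/d(p_k)
            for j, xj in enumerate(obs):
                row[j] -= self.lr * grad_scale * xj


rnd = RNDSketch(obs_dim=3)
familiar = [1.0, 0.0, 0.0]  # visited repeatedly
novel = [0.0, 1.0, 0.0]     # never visited

reward_before = rnd.intrinsic_reward(familiar)
for _ in range(200):
    rnd.update(familiar)
reward_after = rnd.intrinsic_reward(familiar)

print(f"familiar: {reward_before:.4f} -> {reward_after:.6f}")
print(f"novel:    {rnd.intrinsic_reward(novel):.4f}")
```

After training on the familiar observation, its intrinsic reward shrinks, while the reward for the unvisited observation is unaffected. That gap is the exploration signal.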

Results

Commit hash: aa4ebf0c3e9090d11fbd88a5de44aa2189f1d232

  • RND
    • 128 parallel environments, no MPI, CNN policy (no LSTM)
    • All parameters are the same as in the paper
  • PPO

Score

Venture

Montezuma's Revenge

Intrinsic rewards

RND

License

This project is licensed under the Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0).