A framework for distributed Deep Reinforcement Learning implemented in the Chapel Programming Language.
This work is in a pre-pre-alpha stage. Right now, I'm just fleshing out ideas to get to where I want the library to be (which looks like the code in the Usage section). Much of the current code probably looks more like Python than Chapel; this is because I'm waiting for tools like Arkouda (Chapel's NumPy-like interface) to better support linear algebra, and I'm still getting used to Chapel as a language.
If you have ideas on how to implement this library in a more idiomatic "Chapel" way, please don't hesitate to use my contact information below. I would love feedback and help with parallelizing the code when the time comes.
Thanks!
When completed, Kortex will be available through the Mason package manager.
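Once published, pulling Kortex into a project might look like the following Mason.toml dependency entry (a sketch only; the package name and version numbers are hypothetical until a release actually exists):

```toml
[brick]
name = "MyProject"
version = "0.1.0"
chplVersion = "1.33.0"

[dependencies]
Kortex = "0.1.0"
```

Equivalently, something like mason add Kortex on the command line would update the manifest automatically.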
Example of proper framework usage:
use Kortex;

// Agent
var model = new Regressor(),
    agent = new IMPALA(model, pi, env.info,
                       approx_params, batch_size,
                       n_approx=1, init_replay_size,
                       max_replay_size, target_update_freq); // Initializes a new IMPALA agent

// Algorithm
var alg = new Entity(agent, env); // Creates the RL algorithm object
alg.fit(n_steps=init_replay_size,
        n_steps_per_fit=init_replay_size); // Runs the RL algorithm

//...//

alg.evaluate(n_episodes=10, render=true); // Evaluates the learned policy
Eventually, Kortex will be callable from Python while still using Chapel's parallelism under the hood.
import kortex as kx
import tensorflow as tf
kx.chpl_setup()
# Agent #
model = kx.Regressor(tf.keras.models.ResNet50())
agent = kx.IMPALA(model, pi, env.info,
                  approx_params, batch_size,
                  n_approx=1, init_replay_size,
                  max_replay_size, target_update_freq)
# Algorithm #
alg = kx.Entity(agent, env)
alg.fit(n_steps=init_replay_size,
n_steps_per_fit=init_replay_size)
#...#
alg.evaluate(n_episodes=10, render=True)
kx.chpl_cleanup()
Main Functionality:
- Base classes for modularity
- Parallelized code
- NumPy-style array functionality (via Arkouda?)
- Better error handling
- Wrappers for environments such as OpenAI Gym
- Unit testing
- Neural network architectures
- TensorFlow or PyTorch wrappers
- Integration for use from Python
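The environment wrappers above would presumably follow the familiar Gym reset/step contract. As a minimal Python sketch of that contract (all class and method names here are hypothetical illustrations, not Kortex API):

```python
import random

class Environment:
    """Minimal Gym-style environment interface a wrapper might target."""
    def reset(self):
        """Start a new episode and return the initial observation."""
        raise NotImplementedError
    def step(self, action):
        """Apply an action; return (observation, reward, done, info)."""
        raise NotImplementedError

class CoinFlipEnv(Environment):
    """Toy environment: guess a coin flip; episode ends after `horizon` steps."""
    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
    def reset(self):
        self.t = 0
        return 0  # trivial observation
    def step(self, action):
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return 0, reward, done, {}

def rollout(env, policy):
    """Run one episode with the given policy and return the total reward."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, r, done, _ = env.step(policy(obs))
        total += r
    return total

total = rollout(CoinFlipEnv(), policy=lambda obs: 1)
```

A Gym wrapper would then only need to translate between this interface and the external environment's own observation and action types.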
Advanced Functionality:
- Population based training
- Neural Architecture Search
- Evolutionary Strategies/Genetic Algorithms
- ∂ILP
- α-Rank
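Of the advanced items above, evolutionary strategies are the simplest to illustrate. This is a generic NES-style sketch (estimate a search gradient from Gaussian perturbations and ascend it), not Kortex code; all names and hyperparameters are illustrative:

```python
import random

def evolution_strategy(f, x0, sigma=0.5, pop=20, lr=0.1, iters=200, seed=0):
    """Maximize f by sampling a population of Gaussian perturbations,
    weighting each noise vector by its (centered) reward, and stepping
    along the resulting gradient estimate."""
    rng = random.Random(seed)
    x = x0
    for _ in range(iters):
        noise = [rng.gauss(0.0, 1.0) for _ in range(pop)]
        rewards = [f(x + sigma * n) for n in noise]
        mean_r = sum(rewards) / pop
        grad = sum((r - mean_r) * n for r, n in zip(rewards, noise)) / (pop * sigma)
        x += lr * grad
    return x

# Toy objective: a quadratic with its maximum at x = 3.
best = evolution_strategy(lambda x: -(x - 3.0) ** 2, x0=0.0)
```

Because it only needs function evaluations, the per-candidate loop is embarrassingly parallel, which is exactly where Chapel's parallelism would help.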
Algorithms:
- DQN
- DDPG
- SARSA
- Blackbox Optimization (RWR, PGPE, REPS)
- REINFORCE
- GPOMDP
- eNAC
- SAC
- TRPO
- PPO
- A3C
- QMix
- MAML
- Reptile
- IMPALA
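As a flavor of the simplest policy-gradient method on this list, here is a textbook REINFORCE sketch on a two-armed bandit with a softmax policy and a running-average baseline. This is a generic illustration in Python, not Kortex's implementation:

```python
import math, random

def reinforce_bandit(steps=2000, lr=0.1, seed=0):
    """REINFORCE on a 2-armed bandit. Arm 1 pays 1.0 on average and
    arm 0 pays 0.2, so the softmax policy should learn to prefer arm 1."""
    rng = random.Random(seed)
    logits = [0.0, 0.0]
    means = [0.2, 1.0]
    baseline = 0.0
    for _ in range(steps):
        z = [math.exp(l) for l in logits]
        total = sum(z)
        probs = [v / total for v in z]
        a = 0 if rng.random() < probs[0] else 1   # sample an action
        reward = means[a] + rng.gauss(0.0, 0.1)   # noisy bandit payoff
        baseline += 0.05 * (reward - baseline)    # running-average baseline
        adv = reward - baseline
        for i in range(2):
            # d log softmax(a) / d logit_i = 1{i == a} - probs[i]
            grad = (1.0 if i == a else 0.0) - probs[i]
            logits[i] += lr * adv * grad
    return probs

probs = reinforce_bandit()  # probs[1] should dominate after training
```

The same score-function gradient, generalized to parameterized policies over states, is the core of REINFORCE, GPOMDP, and the actor half of the actor-critic methods above.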
- Fork it (https://github.com/londoed/Kortex/fork)
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request
- Eric D. Londo (londoed@comcast.net) - creator, maintainer
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto
Algorithms for Reinforcement Learning by Csaba Szepesvári
Neuro-Dynamic Programming by Dimitri P. Bertsekas and John N. Tsitsiklis
Decision Making Under Uncertainty: Theory and Application by Mykel J. Kochenderfer
Artificial Intelligence: Foundations of Computational Agents by David Poole and Alan Mackworth
Deep Reinforcement Learning Hands-On by Maxim Lapan
Python Reinforcement Learning by Sudharsan Ravichandiran, Sean Saito, Rajalingappaa Shanmugamani and Yang Wenzhuo
Grokking Deep Reinforcement Learning by Miguel Morales
Deep Reinforcement Learning in Action by Alexander Zai and Brandon Brown
Introduction to Reinforcement Learning by David Silver (video playlist)
Advanced Deep Learning & Reinforcement Learning by DeepMind (video playlist)
Move 37 by School of AI (free course)
Dopamine by Google AI
Horizon by Facebook AI Research (FAIR)
TensorFlow Agents by Google Brain
TensorFlow Reinforcement Learning (TRFL) by DeepMind
Ray by RISE Lab at UC Berkeley
Huskarl by Daniel Salvadori
Mushroom by Carlo D'Eramo & David Tateo
See the official website for documentation, presentations, and tutorials.
This framework is inspired by original work done by DeepMind Technologies Ltd., which can be found in this paper.
It also draws on work by Jeff Dean et al. in this paper on the DistBelief system, as well as the TensorFlow paper by Google Brain.
The initial implementation of Kortex was heavily inspired by work done by Carlo D'Eramo and Davide Tateo in this repository.
The paper describing their ideas can be found here.