msdm aims to simplify the design and evaluation of models of sequential decision-making. The library can be used for cognitive science or computer science research and teaching. msdm provides standardized interfaces and implementations for common constructs in sequential decision-making, including algorithms used in single-agent reinforcement learning as well as those used in planning, partially observable environments, and multi-agent games.
The library is organized around different problem classes and algorithms that operate on problem instances. We take inspiration from existing libraries such as scikit-learn that enable users to transparently mix and match components. For instance, a standard way to define a problem, solve it, and examine the results would be:
```python
# create a problem instance
mdp = make_russell_norvig_grid(
    discount_rate=0.95,
    slip_prob=0.8,
)

# solve the problem
vi = ValueIteration()
res = vi.plan_on(mdp)

# print the value function
print(res.V)
```
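Because problem instances and solvers share standardized interfaces, the same `mdp` object can, in the mix-and-match spirit described above, be passed to a different planner in place of `ValueIteration` without restructuring the surrounding code.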
The library is under active development. Currently, we support the following problem classes:
- Markov Decision Processes (MDPs)
- Partially Observable Markov Decision Processes (POMDPs)
- Markov Games
- Partially Observable Stochastic Games (POSGs)
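To make the first of these problem classes concrete, the sketch below writes out the ingredients of a tiny MDP (states, actions, a transition distribution, rewards, and a discount rate) as plain Python dictionaries. This is only an illustration of the formalism itself, not of msdm's problem-definition interface; see the quickstart example above for how problem instances are actually constructed.

```python
# A two-state MDP written out as plain Python data structures.
# This illustrates the general MDP formalism, not msdm's API.
states = ["s0", "s1"]
actions = ["stay", "go"]
discount_rate = 0.95

# transition[s][a] is a probability distribution over next states
transition = {
    "s0": {"stay": {"s0": 1.0}, "go": {"s1": 0.8, "s0": 0.2}},
    "s1": {"stay": {"s1": 1.0}, "go": {"s0": 1.0}},
}

# reward[s][a] is the expected immediate reward for taking action a in state s
reward = {
    "s0": {"stay": 0.0, "go": -1.0},
    "s1": {"stay": 1.0, "go": 0.0},
}
```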
The following algorithms have been implemented and tested:
- Classical Planning
    - Breadth-First Search (Zuse, 1945)
    - A* (Hart, Nilsson & Raphael, 1968)
- Stochastic Planning
    - Value Iteration (Bellman, 1957) (see the sketch after this list)
    - Policy Iteration (Howard, 1960)
    - Labeled Real-Time Dynamic Programming (Bonet & Geffner, 2003)
    - LAO* (Hansen & Zilberstein, 2003)
- Partially Observable Planning
    - QMDP (Littman, Cassandra & Kaelbling, 1995)
    - Point-Based Value Iteration (Pineau, Gordon & Thrun, 2003)
    - Finite-state controller gradient ascent (Meuleau, Kim, Kaelbling & Cassandra, 1999)
    - Bounded finite-state controller policy iteration (Poupart & Boutilier, 2003)
    - Wrappers for POMDPs.jl solvers (requires a Julia installation)
- Reinforcement Learning
    - Q-Learning (Watkins, 1992)
    - Double Q-Learning (van Hasselt, 2010)
    - SARSA (Rummery & Niranjan, 1994)
    - Expected SARSA (van Seijen, van Hasselt, Whiteson & Wiering, 2009)
- Multi-agent Reinforcement Learning (in progress)
    - Correlated Q-Learning (Greenwald & Hall, 2002)
    - Nash Q-Learning (Hu & Wellman, 2003)
    - Friend/Foe Q-Learning (Littman, 2001)
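To give a feel for the simplest of the planners above, here is a minimal, textbook-style implementation of value iteration that operates on the toy MDP dictionaries sketched after the problem-class list. It is a generic illustration for readers new to these algorithms, not the implementation used inside msdm.

```python
# Generic value iteration over the toy MDP defined earlier (not msdm's code).
def value_iteration(states, actions, transition, reward, discount_rate,
                    tolerance=1e-8):
    V = {s: 0.0 for s in states}  # initialize all state values to zero
    while True:
        delta = 0.0
        for s in states:
            # Bellman backup: value of the best action under the current V
            q_values = [
                reward[s][a] + discount_rate * sum(
                    p * V[ns] for ns, p in transition[s][a].items()
                )
                for a in actions
            ]
            new_v = max(q_values)
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tolerance:  # stop once no state value changes appreciably
            return V

# Using the dictionaries from the earlier sketch:
print(value_iteration(states, actions, transition, reward, discount_rate))
```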
We aim to add implementations for other algorithms in the near future (e.g., inverse RL, deep learning, multi-agent learning and planning).
It is recommended to install msdm in a virtual environment. To install the latest release from PyPI:

```bash
$ pip install msdm
```

To install the latest development version from GitHub:

```bash
$ pip install --upgrade git+https://github.com/markkho/msdm.git
```
To work on the source, clone the repository, change into its directory, and install the package locally in editable mode (this creates a symlink, so changes to the source files are picked up without reinstalling):

```bash
$ pip install -e .
```
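A quick way to confirm the installation worked is to import the package from a Python session:

```python
# If installation succeeded, this import completes without error.
import msdm
print(msdm.__file__)  # shows where the package was installed
```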
We welcome contributions in the form of implementations of algorithms for common problem classes that are well-documented in the literature. Please first post an issue and/or reach out to mark.ho.cs@gmail.com to check if a proposed contribution is within the scope of the library.
To run all tests: `make test`
To run the tests in a specific file: `python -m py.test msdm/tests/$TEST_FILE_NAME.py`
To lint the code: `make lint`
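For contributions, a new algorithm would typically come with a test under msdm/tests/. The sketch below shows roughly what such a test might look like, reusing the quickstart example from above; the import paths are assumptions and should be checked against the actual package layout.

```python
# Hypothetical test sketch for msdm/tests/; the import paths below are
# assumptions and may not match the actual package layout.
from msdm.domains import make_russell_norvig_grid  # assumed import path
from msdm.algorithms import ValueIteration         # assumed import path


def test_value_iteration_on_russell_norvig_grid():
    mdp = make_russell_norvig_grid(discount_rate=0.95, slip_prob=0.8)
    res = ValueIteration().plan_on(mdp)
    # the planning result should expose a value function, as in the quickstart
    assert res.V is not None
```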