msdm aims to simplify the design and evaluation of models of sequential decision-making. The library can be used for cognitive science or computer science research and teaching. msdm provides standardized interfaces and implementations for common constructs in sequential decision-making, including algorithms used in single-agent reinforcement learning as well as those used in planning, partially observable environments, and multi-agent games.
The library is organized around different problem classes and algorithms that operate on problem instances. We take inspiration from existing libraries such as scikit-learn that enable users to transparently mix and match components. For instance, a standard way to define a problem, solve it, and examine the results would be:
```python
# create a problem instance
mdp = make_russell_norvig_grid(
    discount_rate=0.95,
    slip_prob=0.8,
)

# solve the problem
vi = ValueIteration()
res = vi.plan_on(mdp)

# print the value function
print(res.V)
```
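Because problem instances and solvers share standardized interfaces, the same `mdp` object can, in the mix-and-match spirit described above, be passed to a different planner in place of `ValueIteration` without restructuring the surrounding code.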
The library is under active development. Currently, we support the following problem classes:
- Markov Decision Processes (MDPs)
- Partially Observable Markov Decision Processes (POMDPs)
- Markov Games
- Partially Observable Stochastic Games (POSGs)
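To make the first of these problem classes concrete, the sketch below writes out the ingredients of a tiny MDP (states, actions, a transition distribution, rewards, and a discount rate) as plain Python dictionaries. This is only an illustration of the formalism itself, not of msdm's problem-definition interface; see the quickstart example above for how problem instances are actually constructed.

```python
# A two-state MDP written out as plain Python data structures.
# This illustrates the general MDP formalism, not msdm's API.
states = ["s0", "s1"]
actions = ["stay", "go"]
discount_rate = 0.95

# transition[s][a] is a probability distribution over next states
transition = {
    "s0": {"stay": {"s0": 1.0}, "go": {"s1": 0.8, "s0": 0.2}},
    "s1": {"stay": {"s1": 1.0}, "go": {"s0": 1.0}},
}

# reward[s][a] is the expected immediate reward for taking action a in state s
reward = {
    "s0": {"stay": 0.0, "go": -1.0},
    "s1": {"stay": 1.0, "go": 0.0},
}
```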
The following algorithms have been implemented and tested:
- Classical Planning
    - Breadth-First Search (Zuse, 1945)
    - A* (Hart, Nilsson & Raphael, 1968)
- Stochastic Planning
    - Value Iteration (Bellman, 1957) (see the sketch after this list)
    - Policy Iteration (Howard, 1960)
    - Labeled Real-Time Dynamic Programming (Bonet & Geffner, 2003)
    - LAO* (Hansen & Zilberstein, 2003)
- Partially Observable Planning
    - QMDP (Littman, Cassandra & Kaelbling, 1995)
    - Point-Based Value Iteration (Pineau, Gordon & Thrun, 2003)
    - Finite-state controller gradient ascent (Meuleau, Kim, Kaelbling & Cassandra, 1999)
    - Bounded finite-state controller policy iteration (Poupart & Boutilier, 2003)
    - Wrappers for POMDPs.jl solvers (requires a Julia installation)
- Reinforcement Learning
    - Q-Learning (Watkins, 1992)
    - Double Q-Learning (van Hasselt, 2010)
    - SARSA (Rummery & Niranjan, 1994)
    - Expected SARSA (van Seijen, van Hasselt, Whiteson & Wiering, 2009)
- Multi-agent Reinforcement Learning (in progress)
    - Correlated Q-Learning (Greenwald & Hall, 2002)
    - Nash Q-Learning (Hu & Wellman, 2003)
    - Friend/Foe Q-Learning (Littman, 2001)
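To give a feel for the simplest of the planners above, here is a minimal, textbook-style implementation of value iteration that operates on the toy MDP dictionaries sketched after the problem-class list. It is a generic illustration for readers new to these algorithms, not the implementation used inside msdm.

```python
# Generic value iteration over the toy MDP defined earlier (not msdm's code).
def value_iteration(states, actions, transition, reward, discount_rate,
                    tolerance=1e-8):
    V = {s: 0.0 for s in states}  # initialize all state values to zero
    while True:
        delta = 0.0
        for s in states:
            # Bellman backup: value of the best action under the current V
            q_values = [
                reward[s][a] + discount_rate * sum(
                    p * V[ns] for ns, p in transition[s][a].items()
                )
                for a in actions
            ]
            new_v = max(q_values)
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tolerance:  # stop once no state value changes appreciably
            return V

# Using the dictionaries from the earlier sketch:
print(value_iteration(states, actions, transition, reward, discount_rate))
```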
We aim to add implementations for other algorithms in the near future (e.g., inverse RL, deep learning, multi-agent learning and planning).
It is recommended to install msdm in a virtual environment. To install the latest release from PyPI:

```bash
$ pip install msdm
```

To install the latest development version from GitHub:

```bash
$ pip install --upgrade git+https://github.com/markkho/msdm.git
```
To work on the source, clone the repository, change into its directory, and install the package locally in editable mode (this creates a symlink, so changes to the source files are picked up without reinstalling):

```bash
$ pip install -e .
```
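A quick way to confirm the installation worked is to import the package from a Python session:

```python
# If installation succeeded, this import completes without error.
import msdm
print(msdm.__file__)  # shows where the package was installed
```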
We welcome contributions in the form of implementations of algorithms for common problem classes that are well-documented in the literature. Please first post an issue and/or reach out to mark.ho.cs@gmail.com to check if a proposed contribution is within the scope of the library.
To run all tests: `make test`
To run the tests in a specific file: `python -m py.test msdm/tests/$TEST_FILE_NAME.py`
To lint the code: `make lint`
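For contributions, a new algorithm would typically come with a test under msdm/tests/. The sketch below shows roughly what such a test might look like, reusing the quickstart example from above; the import paths are assumptions and should be checked against the actual package layout.

```python
# Hypothetical test sketch for msdm/tests/; the import paths below are
# assumptions and may not match the actual package layout.
from msdm.domains import make_russell_norvig_grid  # assumed import path
from msdm.algorithms import ValueIteration         # assumed import path


def test_value_iteration_on_russell_norvig_grid():
    mdp = make_russell_norvig_grid(discount_rate=0.95, slip_prob=0.8)
    res = ValueIteration().plan_on(mdp)
    # the planning result should expose a value function, as in the quickstart
    assert res.V is not None
```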