Plato Toolkit: Reinforcement Learning on Azure ML

Plato is a Python toolkit that enables users to train reinforcement learning (RL) agents at scale on Azure Machine Learning (AML) compute clusters using Ray RLLib. With AML, you can access powerful CPU or GPU enabled virtual machines to scale up your training to meet the computational speed and load requirements of your simulation environment and model architecture.

In addition to RL training and assessment on AML, Plato offers additional features and guidance for hyperparameter tuning via Ray Tune, experiment management with MLflow, curriculum learning, and Dockerized deployment of a trained agent.

Documentation

Glossary
Prerequisites
Examples:
- Get Started:
  - Getting Started on AML: A minimal working example of a Python simulation environment that can be connected to RLlib and used to train an agent on AML. You can think of it as a "Hello World" example.
  - Getting Started with an Anylogic Sim: Connect your AnyLogic simulation environment to RLlib using Baobab to run reinforcement learning experiments on AML.
- Experiment:
  - Hyperparameter Tuning and Monitoring: Learn how to tune, monitor, and download agents on AML.
  - Curriculum Learning: Explore how to gradually increases the difficulty or complexity of the task(s) that the agent has to solve, which can improve the learning efficiency and generalization.
  - Logging Trajectories: Find out how to log the trajectories of an agent with callbacks during training and evaluation, and access the logged data in your AML workspace.
- Assess and Deploy:
  - Custom Assessments: Collect logs for custom episode configurations evaluated on a trained agent.
  - Deploy Agent: Serve your agent locally or package and deploy it on Azure.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Prerequisites

Please ensure to install the development dependencies needed for this project in your virtual environment. You can do this by running the following command from the root of the repository:

pip install -r requirements/dev_requirements.txt

Coding Standards

In this repository, we use code linting software and require to run tests before a PR can be merged.

Each time a PR is opened, CI checks run to ensure that the code complies with PEP8, is type safe, is formatted according to black's conventions, and imports are correctly sorted. In addition, the CI pipeline runs unit tests to assess that the package build correctly and all features work as expected.

To be proactive and not discover that code cannot be merged only when opening a PR, we suggest to run the following software in your local computer:

flake8
mypy
isort
black
pytest

To launch these programs, simply go into the root of the repository and launch them. You can automatically lint your code by setting your IDE of choice. For example, to set VSCode to automatically lint the file you are editing with flake8 and mypy follow this tutorial. To run black in VSCode, you can follow this tutorial. A similar process has to be followed for isort.

Precommit Hooks

To get a smoother developer experience, please install pre-commit hooks in your environment:

pip install pre-commit
pre-commit install

Now every time you try to commit, the code is linted and issues fixed automatically when possible, otherwise the offending lines are shown on the screen.

Opening a PR

When opening a PR, we require code reviews before merging. To make code reviews easier, it is recommended that developers adhere to the following guidelines:

Commits should be self-contained and should change just one thing in the code
Commit messages should be clear and descriptive of the commit’s purpose
PR can contain more commits, but it is always better to have one PR for one commit
PR should have a clear description of why the change is made and why it is made in a particular way

Building the docs

Docs are built automatically using mkdocs whenever a change lands to main.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

Azure/plato