Semi-Markov Afterstate Actor-Critic (SMAAC) with Maze

The "Learning to run a power network" (L2RPN) challenge is a series of competitions proposed by Kelly et al. (2020) to test the potential of reinforcement learning for controlling electrical power transmission. The challenge is motivated by the fact that existing methods cannot handle real-time network operations on short temporal horizons within a reasonable compute time. In addition, power networks face a steadily growing share of renewable energy, which requires faster responses. This raises the need for highly robust and adaptive power grid controllers. One such competition was run at the IEEE World Congress on Computational Intelligence (WCCI) 2020. The winners published their novel approach, which combines a Semi-MDP with an afterstate representation, at ICLR 2021 and made their implementation publicly available. The latest iteration of the L2RPN challenge poses a welcome opportunity to introduce our RL framework Maze and to replicate the winning approach with it.

This repository contains all the code and instructions needed to replicate the SMAAC approach with Maze. For a more extensive write-up, see our accompanying blog post.

Maze

Maze is an application-oriented deep reinforcement learning (RL) framework, addressing real-world decision problems. Our vision is to cover the complete development life-cycle of RL applications, ranging from simulation engineering to agent development, training and deployment.

If you encounter a bug, miss a feature, or have a question that the documentation doesn't answer, we are happy to assist you. Report an issue or start a discussion on GitHub or StackOverflow.

Installation

As Conda environment

Install all dependencies:

conda env create -f environment.yml
conda activate maze_smaac

Install lightsim2grid, a fast backend for Grid2Op:

chmod +x install_lightsim2grid.sh
./install_lightsim2grid.sh

Optional: Install this repository to include it in your Python path with

pip install -e .

As Docker image

Execute

docker buildx build -t enliteai/maze_smaac --build-arg MAZE_CORE_ENV=enliteai/maze:latest -f docker/maze_smaac.dockerfile .

to locally build a Docker image. Alternatively pull it with:

docker pull enliteai/maze_smaac:latest

Start a container with:

docker run -it enliteai/maze_smaac:latest /bin/bash

Data Download

You need the chronics data from the official SMAAC repo. The training script downloads and unpacks the required data automatically if it's not available in maze_smaac/data. The data download can also be started explicitly with

python scripts/data_acquisition.py

Alternatively you can manually download the data from here. The extracted data folder should replace maze_smaac/data.
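
If you want to verify that the data is in place before starting a long training run, a check along the following lines may help. This is a minimal sketch, not the logic used in scripts/data_acquisition.py: the function name `chronics_data_available` is hypothetical, and it only assumes that a non-empty maze_smaac/data directory means the data has been downloaded and extracted.

```python
from pathlib import Path


def chronics_data_available(data_dir="maze_smaac/data"):
    """Return True if data_dir exists and contains at least one entry.

    Assumption: a non-empty directory means the chronics data is present.
    The contents themselves are not validated.
    """
    path = Path(data_dir)
    return path.is_dir() and any(path.iterdir())


if not chronics_data_available():
    print("No data found in maze_smaac/data; run scripts/data_acquisition.py first.")
```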

Test Installation

NOTE: If you haven't run pip install -e ., you need to prefix all CLI commands with PYTHONPATH='.'.

If everything is installed correctly, this command should execute successfully:

maze-run -cn conf_train env=maze_smaac model=maze_smaac_nets wrappers=maze_smaac_debug +experiment=sac_dev
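
When launching commands from a Python script instead of a shell, the PYTHONPATH prefix from the note above can be reproduced by passing a modified environment to the subprocess. The helper below is a sketch under that assumption; `env_with_repo_on_path` is a hypothetical name, not part of Maze or this repository.

```python
import os


def env_with_repo_on_path(repo_root=".", base_env=None):
    """Return a copy of the environment with repo_root prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH")
    if existing:
        # Keep whatever was already on the path, but search the repo first.
        env["PYTHONPATH"] = os.pathsep.join([repo_root, existing])
    else:
        env["PYTHONPATH"] = repo_root
    return env


env = env_with_repo_on_path(".")
print(env["PYTHONPATH"])
```

The resulting dictionary can be handed to `subprocess.run(..., env=env)` so that the child process sees the repository root on its Python path.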

Training

Via CLI

Start training in train mode:

maze-run -cn conf_train env=maze_smaac model=maze_smaac_nets wrappers=maze_smaac_train +experiment=sac_train

Start training in debug mode:

maze-run -cn conf_train env=maze_smaac model=maze_smaac_nets wrappers=maze_smaac_debug +experiment=sac_dev

Perform rollout of trained policy:

maze-run policy=torch_policy env=maze_smaac model=maze_smaac_nets wrappers=maze_smaac_rollout runner=evaluation input_dir=EXPERIMENT_LOGDIR
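
The three commands above differ only in their overrides, so when scripting several runs it can be convenient to assemble them programmatically. The following sketch rebuilds the training and rollout commands as argument lists; `maze_run_command` is a hypothetical helper, and EXPERIMENT_LOGDIR is kept as the same placeholder used above.

```python
import shlex


def maze_run_command(overrides, config_name=None):
    """Assemble a maze-run invocation as an argument list."""
    cmd = ["maze-run"]
    if config_name is not None:
        cmd += ["-cn", config_name]
    # Each override becomes a single key=value argument, in insertion order.
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


# Overrides shared by all commands in this README.
common = {"env": "maze_smaac", "model": "maze_smaac_nets"}

train_cmd = maze_run_command(
    {**common, "wrappers": "maze_smaac_train", "+experiment": "sac_train"},
    config_name="conf_train",
)
rollout_cmd = maze_run_command(
    {
        "policy": "torch_policy",
        **common,
        "wrappers": "maze_smaac_rollout",
        "runner": "evaluation",
        "input_dir": "EXPERIMENT_LOGDIR",  # placeholder: replace with your log directory
    }
)

print(shlex.join(train_cmd))
print(shlex.join(rollout_cmd))
```

Either list can be passed to `subprocess.run(train_cmd, check=True)` once the installation has been verified.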

Via Python API

To invoke training, evaluation and rollout from Python, run the script that uses Maze's Python API:

python scripts/training.py

We encourage you to use the snippets in training.py as a starting point to customize the training configuration and write your own scripts.
