This repository contains the code for my open-source project "Deep Reinforcement Learning for Systems Research", which was part of the module "R244: Large-scale data processing and optimisation" in Michaelmas term 2022 at the University of Cambridge. The project explored how researchers and practitioners can utilize the open-source libraries Park and RLlib for systems and network research. Park takes a similar approach to Gymnasium and provides several systems-based environments to test new reinforcement learning algorithms. An introduction to the library can be found in the following paper:
Hongzi Mao et al. “Park: An Open Platform for Learning Augmented Computer Systems”. In: Advances in Neural Information Processing Systems. Ed. by H. Wallach et al. Vol. 32. Curran Associates, Inc., 2019.
RLlib is a library built on top of the distributed execution engine Ray and can be used to design, implement, train, optimize and test existing as well as novel deep reinforcement learning algorithms. The following paper explains RLlib's main underlying ideas and concepts:
Eric Liang et al. “RLlib: Abstractions for Distributed Reinforcement Learning”. In: Proceedings of the 35th International Conference on Machine Learning. Ed. by Jennifer Dy and Andreas Krause. Vol. 80. Proceedings of Machine Learning Research. PMLR, July 2018, pp. 3053–3062.
As part of the open-source project, I extended the main Park repository in the following ways:
- Added support for Dict spaces in Park
- Implemented a custom `ProcessScheduling` environment using Park and Gym
- Implemented three baseline scheduling agents for the `ProcessScheduling` environment in Park and Gym
- Extended an existing Park -> Gym space/environment wrapper to Dict, Tuple and Graph spaces (a rough sketch of the idea is given after this list)
- Implemented a `test_environments()` function to automatically check which Park environments can be successfully initialized, wrapped in a Gym environment and are compatible with RLlib
- Added experimental code for optimizing hyperparameters of deep reinforcement learning algorithms using RLlib for Park and CompilerGym environments
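The extended space/environment wrapper is part of `env_wrapper`. As a rough, purely illustrative sketch (the helper name `convert_space` and the exact attribute names are assumptions based on Park's Gym-like interface, not the actual implementation), converting nested Park spaces to Gym spaces could look like this:

```python
import gym.spaces as gym_spaces
import park.spaces as park_spaces

def convert_space(space):
    """Hypothetical sketch: recursively convert a Park space to a Gym space."""
    if isinstance(space, park_spaces.Box):
        return gym_spaces.Box(low=space.low, high=space.high, dtype=space.dtype)
    if isinstance(space, park_spaces.Discrete):
        return gym_spaces.Discrete(space.n)
    if isinstance(space, park_spaces.Dict):
        return gym_spaces.Dict({key: convert_space(sub) for key, sub in space.spaces.items()})
    if isinstance(space, park_spaces.Tuple):
        return gym_spaces.Tuple(tuple(convert_space(sub) for sub in space.spaces))
    # Park spaces without a supported Gym counterpart trigger the
    # NotImplementedError reported by test_environments()
    raise NotImplementedError(f'Unsupported Park space: {type(space)}')
```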
Park environments can be divided into two distinct groups: real and simulated ones. Real environments are executed directly on the user's system, whereas simulated ones rely on, for example, traces of previous environment observations. However, not all (real) Park environments can be executed on every hardware device / operating system. Furthermore, Park is no longer actively maintained, which results in additional issues for some environments.
Another problem is that although Park exposes a similar interface to the one popularized by OpenAI Gym, it is not built on top of Gymnasium. Instead, Park environments utilize custom spaces, resulting in compatibility issues with many other libraries such as RLlib. While some environments can be made compatible by converting the Park spaces to Gym spaces, others cannot. The table below provides an overview of which Park environments can be initialized successfully, converted to a Gym environment and are compatible with RLlib. The tests were performed using macOS Ventura 13.1 and Ubuntu 20.04.5 LTS.
Environment | ID | Instantiation Possible | Gymnasium Compatibility | RLlib Compatibility |
---|---|---|---|---|
Adaptive video streaming | abr | ❌ | - | - |
Adaptive video streaming | abr_sim | ❌ | - | - |
Network active queue management | aqm | ✅ | ❌ | - |
CDN memory caching | cache | ✅ | ✅ | ❌ |
Circuit design | circuit_three_stage_transimpedance | ✅ | ✅ | ❌ |
Network congestion control | congestion_control | ❌ | - | - |
Server load balancing | load_balance | ✅ | ✅ | ✅ |
Multi-dim database indexing | multi_dim_index | ✅ | ❌ | - |
SQL database query optimization | query_optimizer | ❌ | - | - |
Account region assignment | region_assignment | ✅ | ❌ | - |
Simple queue | simple_queue | ✅ | ✅ | ✅ |
Spark cluster job scheduling | spark | ❌ | - | - |
Spark cluster job scheduling | spark_sim | ✅ | ❌ | - |
Switch scheduling | switch_scheduling | ✅ | ✅ | ✅ |
Tensorflow device placement | tf_placement | ❌ | - | - |
Tensorflow device placement | tf_placement_sim | ✅ | ❌ | - |
Process scheduling | process_scheduling | ✅ | ✅ | ❌ |
As the table shows, initialization succeeds for 11 out of the 17 Park environments, 6 of them can be converted to Gym environments, and only 3 are compatible with RLlib. You can re-run the tests on your own device using the `test_environments()` function described below.
- Clone this repository:
  ```bash
  git clone https://github.com/jakobhartmann/deep-rl-system-research.git
  ```
- Change directory:
  ```bash
  cd deep-rl-system-research
  ```
- Install the dependencies:
  ```bash
  pip install -r requirements.txt
  ```
To check which Park environments work on your system, run:

```python
from env_wrapper import test_environments

envs = ['abr', 'abr_sim', 'aqm', 'cache', 'circuit_three_stage_transimpedance', 'congestion_control', 'load_balance', 'query_optimizer', 'multi_dim_index', 'region_assignment', 'simple_queue', 'spark', 'spark_sim', 'switch_scheduling', 'tf_placement', 'tf_placement_sim', 'process_scheduling']
env_summary = test_environments(envs=envs)
print(env_summary)
```
The function returns a dictionary in which each key corresponds to one environment. The value can take three possible forms:
- `Success!`: The initialization succeeded, the conversion to a Gym environment worked and the environment is compatible with RLlib.
- `NotImplementedError`: The initialization succeeded, but at least one Park space is not supported by Gymnasium.
- Any other error message: The initialization of the Park environment failed.
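For example, to list only the environments that pass all checks (reusing `env_summary` from the snippet above and assuming the success value is the literal string `Success!` as listed here):

```python
# Keep only the environments whose test result indicates full RLlib compatibility
rllib_compatible = [env for env, result in env_summary.items() if result == 'Success!']
print(rllib_compatible)
```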
To run one of the baseline agents on the `ProcessScheduling` Park environment:

```python
import park
from process_scheduling_park_agents import *

env = park.make('process_scheduling')
done = False
obs, info = env.reset()
while not done:
    act = sjf_agent(obs)  # Use the Shortest-Job-First agent
    obs, reward, done, info = env.step(act)
```
Alternatively, you can also run the given example:

```bash
python process_scheduling_park_agents.py
```
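The agents imported above are implemented in `process_scheduling_park_agents.py`. Purely as an illustration of the idea (the real observation layout is defined by the custom environment and will differ), a Shortest-Job-First policy over a hypothetical array of remaining job runtimes could be as simple as:

```python
import numpy as np

def sjf_agent_sketch(remaining_runtimes):
    # Hypothetical: schedule the job with the smallest remaining runtime
    return int(np.argmin(remaining_runtimes))
```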
To run the same agents on the Gym version of the environment:

```python
import gym
from gym.envs.registration import register
from process_scheduling_gym.process_scheduling_gym_agents import *

# Register the custom environment with Gym
register(
    id='ProcessScheduling-v0',
    entry_point='process_scheduling_gym.process_scheduling_gym_env:ProcessSchedulingEnv',
    max_episode_steps=1000,
)

env = gym.make('process_scheduling_gym.process_scheduling_gym_env:ProcessScheduling-v0')
done = False
obs, info = env.reset()
while not done:
    act = sjf_agent(obs)  # Use the Shortest-Job-First agent
    obs, reward, done, info = env.step(act)
```
Alternatively, you can also run the given example:

```bash
python process_scheduling_gym/process_scheduling_gym_agents.py
```
To check whether a Park environment wrapped with `ParkAgent` is compatible with RLlib:

```python
from env_wrapper import ParkAgent
from ray import rllib

park_agent = ParkAgent({'name': 'process_scheduling'})
rllib.utils.check_env(park_agent)
```
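Beyond the environment check, a wrapped Park environment can be passed directly to an RLlib algorithm. The following is a minimal sketch, assuming a Ray 2.x installation and the `env_config` constructor pattern shown above; `load_balance` is chosen because it is one of the RLlib-compatible environments in the table:

```python
from ray.rllib.algorithms.ppo import PPOConfig
from env_wrapper import ParkAgent

# Minimal sketch: train PPO on a wrapped Park environment (assumes Ray 2.x)
config = (
    PPOConfig()
    .environment(env=ParkAgent, env_config={'name': 'load_balance'})
    .rollouts(num_rollout_workers=1)
    .framework('torch')
)
algo = config.build()

for _ in range(3):
    result = algo.train()
    print(result['episode_reward_mean'])
```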