This repository contains the code for my open-source project "Deep Reinforcement Learning for Systems Research", which was part of the module "R244: Large-scale data processing and optimisation" in Michaelmas term 2022 at the University of Cambridge. The project explored how researchers and practitioners can utilize the open-source libraries Park and RLlib for systems and network research. Park takes a similar approach to Gymnasium and provides several systems-based environments to test new reinforcement learning algorithms. An introduction to the library can be found in the following paper:
Hongzi Mao et al. “Park: An Open Platform for Learning Augmented Computer Systems”. In: Advances in Neural Information Processing Systems. Ed. by H. Wallach et al. Vol. 32. Curran Associates, Inc., 2019.
RLlib is a library built on top of the distributed execution engine Ray and can be used to design, implement, train, optimize and test existing as well as novel deep reinforcement learning algorithms. The following paper explains RLlib's main underlying ideas and concepts:
Eric Liang et al. “RLlib: Abstractions for Distributed Reinforcement Learning”. In: Proceedings of the 35th International Conference on Machine Learning. Ed. by Jennifer Dy and Andreas Krause. Vol. 80. Proceedings of Machine Learning Research. PMLR, July 2018, pp. 3053–3062.
As part of the open-source project, I extended the main Park repository in the following ways:
- Added support for Dict spaces in Park
- Implemented a custom `ProcessScheduling` environment using Park and Gym
- Implemented three baseline scheduling agents for the `ProcessScheduling` environment in Park and Gym
- Extended an existing Park -> Gym space/environment wrapper to Dict, Tuple and Graph spaces (a rough sketch of the idea is given after this list)
- Implemented a `test_environments()` function to automatically check which Park environments can be successfully initialized, wrapped in a Gym environment and are compatible with RLlib
- Added experimental code for optimizing hyperparameters of deep reinforcement learning algorithms using RLlib for Park and CompilerGym environments
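The extended space/environment wrapper is part of `env_wrapper`. As a rough, purely illustrative sketch (the helper name `convert_space` and the exact attribute names are assumptions based on Park's Gym-like interface, not the actual implementation), converting nested Park spaces to Gym spaces could look like this:

```python
import gym.spaces as gym_spaces
import park.spaces as park_spaces

def convert_space(space):
    """Hypothetical sketch: recursively convert a Park space to a Gym space."""
    if isinstance(space, park_spaces.Box):
        return gym_spaces.Box(low=space.low, high=space.high, dtype=space.dtype)
    if isinstance(space, park_spaces.Discrete):
        return gym_spaces.Discrete(space.n)
    if isinstance(space, park_spaces.Dict):
        return gym_spaces.Dict({key: convert_space(sub) for key, sub in space.spaces.items()})
    if isinstance(space, park_spaces.Tuple):
        return gym_spaces.Tuple(tuple(convert_space(sub) for sub in space.spaces))
    # Park spaces without a supported Gym counterpart trigger the
    # NotImplementedError reported by test_environments()
    raise NotImplementedError(f'Unsupported Park space: {type(space)}')
```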
Park environments can be divided into two distinct groups: real and simulated ones. Real environments are executed directly on the user's system, whereas simulated ones rely on, for example, traces of previous environment observations. However, not all (real) Park environments can be executed on every hardware device / operating system. Furthermore, Park is no longer actively maintained, which results in additional issues for some environments.
Another problem is that although Park exposes a similar interface to the one popularized by OpenAI Gym, it is not built on top of Gymnasium. Instead, Park environments utilize custom spaces, resulting in compatibility issues with many other libraries such as RLlib. While some environments can be made compatible by converting the Park spaces to Gym spaces, others cannot. The table below provides an overview of which Park environments can be initialized successfully, converted to a Gym environment and are compatible with RLlib. The tests were performed using macOS Ventura 13.1 and Ubuntu 20.04.5 LTS.
Environment | ID | Instantiation Possible | Gymnasium Compatibility | RLlib Compatibility |
---|---|---|---|---|
Adaptive video streaming | abr | ❌ | - | - |
Adaptive video streaming | abr_sim | ❌ | - | - |
Network active queue management | aqm | ✅ | ❌ | - |
CDN memory caching | cache | ✅ | ✅ | ❌ |
Circuit design | circuit_three_stage_transimpedance | ✅ | ✅ | ❌ |
Network congestion control | congestion_control | ❌ | - | - |
Server load balancing | load_balance | ✅ | ✅ | ✅ |
Multi-dim database indexing | multi_dim_index | ✅ | ❌ | - |
SQL database query optimization | query_optimizer | ❌ | - | - |
Account region assignment | region_assignment | ✅ | ❌ | - |
Simple queue | simple_queue | ✅ | ✅ | ✅ |
Spark cluster job scheduling | spark | ❌ | - | - |
Spark cluster job scheduling | spark_sim | ✅ | ❌ | - |
Switch scheduling | switch_scheduling | ✅ | ✅ | ✅ |
Tensorflow device placement | tf_placement | ❌ | - | - |
Tensorflow device placement | tf_placement_sim | ✅ | ❌ | - |
Process scheduling | process_scheduling | ✅ | ✅ | ❌ |
As the table shows, initialization succeeds for 11 out of the 17 Park environments, 6 of them can be converted to Gym environments, and only 3 are compatible with RLlib. You can re-run the tests on your own device using the `test_environments()` function described below.
- Clone this repository:
  ```bash
  git clone https://github.com/jakobhartmann/deep-rl-system-research.git
  ```
- Change directory:
  ```bash
  cd deep-rl-system-research
  ```
- Install the dependencies:
  ```bash
  pip install -r requirements.txt
  ```
To check which Park environments work on your system, run:

```python
from env_wrapper import test_environments

envs = ['abr', 'abr_sim', 'aqm', 'cache', 'circuit_three_stage_transimpedance', 'congestion_control', 'load_balance', 'query_optimizer', 'multi_dim_index', 'region_assignment', 'simple_queue', 'spark', 'spark_sim', 'switch_scheduling', 'tf_placement', 'tf_placement_sim', 'process_scheduling']
env_summary = test_environments(envs=envs)
print(env_summary)
```
The function returns a dictionary in which each key corresponds to one environment. The value can take three possible forms:
- `Success!`: The initialization succeeded, the conversion to a Gym environment worked and the environment is compatible with RLlib.
- `NotImplementedError`: The initialization succeeded, but at least one Park space is not supported by Gymnasium.
- Any other error message: The initialization of the Park environment failed.
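For example, to list only the environments that pass all checks (reusing `env_summary` from the snippet above and assuming the success value is the literal string `Success!` as listed here):

```python
# Keep only the environments whose test result indicates full RLlib compatibility
rllib_compatible = [env for env, result in env_summary.items() if result == 'Success!']
print(rllib_compatible)
```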
To run one of the baseline agents on the `ProcessScheduling` Park environment:

```python
import park
from process_scheduling_park_agents import *

env = park.make('process_scheduling')
done = False
obs, info = env.reset()
while not done:
    act = sjf_agent(obs)  # Use the Shortest-Job-First agent
    obs, reward, done, info = env.step(act)
```
Alternatively, you can also run the given example:

```bash
python process_scheduling_park_agents.py
```
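The agents imported above are implemented in `process_scheduling_park_agents.py`. Purely as an illustration of the idea (the real observation layout is defined by the custom environment and will differ), a Shortest-Job-First policy over a hypothetical array of remaining job runtimes could be as simple as:

```python
import numpy as np

def sjf_agent_sketch(remaining_runtimes):
    # Hypothetical: schedule the job with the smallest remaining runtime
    return int(np.argmin(remaining_runtimes))
```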
To run the same agents on the Gym version of the environment:

```python
import gym
from gym.envs.registration import register
from process_scheduling_gym.process_scheduling_gym_agents import *

# Register the custom environment with Gym
register(
    id='ProcessScheduling-v0',
    entry_point='process_scheduling_gym.process_scheduling_gym_env:ProcessSchedulingEnv',
    max_episode_steps=1000,
)

env = gym.make('process_scheduling_gym.process_scheduling_gym_env:ProcessScheduling-v0')
done = False
obs, info = env.reset()
while not done:
    act = sjf_agent(obs)  # Use the Shortest-Job-First agent
    obs, reward, done, info = env.step(act)
```
Alternatively, you can also run the given example:

```bash
python process_scheduling_gym/process_scheduling_gym_agents.py
```
To check whether a Park environment wrapped with `ParkAgent` is compatible with RLlib:

```python
from env_wrapper import ParkAgent
from ray import rllib

park_agent = ParkAgent({'name': 'process_scheduling'})
rllib.utils.check_env(park_agent)
```
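Beyond the environment check, a wrapped Park environment can be passed directly to an RLlib algorithm. The following is a minimal sketch, assuming a Ray 2.x installation and the `env_config` constructor pattern shown above; `load_balance` is chosen because it is one of the RLlib-compatible environments in the table:

```python
from ray.rllib.algorithms.ppo import PPOConfig
from env_wrapper import ParkAgent

# Minimal sketch: train PPO on a wrapped Park environment (assumes Ray 2.x)
config = (
    PPOConfig()
    .environment(env=ParkAgent, env_config={'name': 'load_balance'})
    .rollouts(num_rollout_workers=1)
    .framework('torch')
)
algo = config.build()

for _ in range(3):
    result = algo.train()
    print(result['episode_reward_mean'])
```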