OpenAI Gym wrapper for the Quanser Qube and Quanser Aero

Primary language: Python. License: MIT.

Quanser OpenAI Driver

Provides an OpenAI Gym wrapper for the Quanser Qube Servo 2 and Quanser Aero.

Setup

We have tested on Ubuntu 16.04 LTS and Ubuntu 18.04 LTS using Python 2.7 and Python 3.6.5.

Prerequisites

Install the HIL SDK from Quanser.
A mirror is available at https://github.com/quanser/hil_sdk_linux_x86_64.

You can install the HIL SDK from the mirror by running:

    git clone https://github.com/quanser/hil_sdk_linux_x86_64.git
    sudo chmod a+x ./hil_sdk_linux_x86_64/setup_hil_sdk ./hil_sdk_linux_x86_64/uninstall_hil_sdk
    sudo ./hil_sdk_linux_x86_64/setup_hil_sdk

You must also have pip installed:

    sudo apt-get install python3-pip

Installation

We recommend using a virtual environment such as conda (preferred), virtualenv, or Pipenv.

You can install the driver by cloning and pip-installing:

    git clone https://github.com/BlueRiverTech/quanser-openai-driver.git
    cd quanser-openai-driver
    pip3 install -e .

Once that is set up, run the classical control baseline (make sure the Qube is connected to your computer):

    python tests/test.py --env QubeSwingupEnv --controller flip

Usage

Usage is very similar to most OpenAI Gym environments, but you must close the environment when you are finished. If the environment is not closed safely, you will usually be unable to reopen the board.

This can be done with a context manager, using a with statement:

    import gym
    from gym_brt import QubeSwingupEnv

    num_episodes = 10
    num_steps = 250

    # The with statement closes the environment when the block exits.
    with QubeSwingupEnv() as env:
        for episode in range(num_episodes):
            state = env.reset()
            for step in range(num_steps):
                # Sample a random action from the action space.
                action = env.action_space.sample()
                state, reward, done, _ = env.step(action)

Alternatively, the environment can be closed manually by calling env.close(). You can see an example here.
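
When closing manually, a common pattern is to wrap the loop in try/finally so that close() still runs if a step raises. The following is a minimal sketch under stated assumptions, not code from this repository: StubEnv is a hypothetical stand-in (not part of gym_brt) so the example runs without hardware, and run_episodes is an illustrative helper name.

```python
class StubEnv:
    """Hypothetical stand-in for a hardware environment (not part of gym_brt)."""

    def __init__(self):
        self.closed = False
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0.0

    def step(self, action):
        if self.closed:
            raise RuntimeError("environment is closed")
        self.steps += 1
        # Pretend each episode terminates after 5 steps.
        done = self.steps >= 5
        return float(self.steps), 0.0, done, {}

    def close(self):
        self.closed = True


def run_episodes(env, num_episodes, num_steps, policy):
    """Run episodes on env and always close it, even if a step raises."""
    total_steps = 0
    try:
        for episode in range(num_episodes):
            state = env.reset()
            for step in range(num_steps):
                state, reward, done, _ = env.step(policy(state))
                total_steps += 1
                if done:
                    break
    finally:
        # Runs on normal exit and on exceptions alike.
        env.close()
    return total_steps


if __name__ == "__main__":
    env = StubEnv()
    steps = run_episodes(env, num_episodes=2, num_steps=250, policy=lambda s: 0.0)
    print(steps, env.closed)
```

With real hardware, the same shape would presumably apply with a QubeSwingupEnv instance in place of StubEnv and a policy that returns a valid action, for example one sampled from env.action_space.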

Environments

Information about various environments can be found in docs/envs and our whitepaper.

Control

Information about baselines can be found in docs/control.

Hardware Wrapper

Information about the Python wrapper for Quanser hardware and Qube Servo 2 simulator can be found in docs/quanser and our whitepaper.

Citing

If you use this in your research, please cite the following whitepaper:

    @misc{2001.02254,
      author = {{Polzounov}, Kirill and {Sundar}, Ramitha and {Redden}, Lee},
      title = "{Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware}",
      year = {2019},
      eprint = {arXiv:2001.02254},
      howpublished = {Accepted at the Workshop on Deep Reinforcement Learning at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.}
    }