mobrob

Mobile Robot Control via Goal-Conditioned Reinforcement Learning

A collection of mobile robot environments and their goal-conditioned reinforcement learning controllers.

Setup

Clone the repository and install the package via pip:

git clone https://github.com/ZikangXiong/mobrob
cd mobrob
pip install -e .
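
To verify the installation, you can try importing the package (assuming it installs under the name mobrob):

python -c "import mobrob"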

This project partially relies on mujoco-py; follow the official guide to set it up. mujoco-py depends on glew, mesalib, and glfw3. If you do not have permission to install these system libraries, you can use conda instead:

conda install -c conda-forge glew 
conda install -c conda-forge mesalib 
conda install -c menpo glfw3

Features

Environments:

This repository provides five mobile robot environments:

| Body type  | Description                  | Simulator | State dim | Action dim | Control type        | Video          |
|------------|------------------------------|-----------|-----------|------------|---------------------|----------------|
| point      | point mass                   | mujoco-py | 14        | 2          | Continuous Commands | point.mp4      |
| car        | car-like kinematics          | mujoco-py | 26        | 2          | Continuous Commands | car.mp4        |
| doggo      | quadruped dog kinematics     | mujoco-py | 58        | 12         | Continuous Commands | doggo.mp4      |
| drone      | drone kinematics             | pybullet  | 12        | 18         | Neural PID          | drone.mp4      |
| turtlebot3 | turtlebot3-waffle kinematics | pybullet  | 43        | 2          | Neural Prop         | turtlebot3.mp4 |

Continuous Commands: continuous control commands are generated by the control policy directly.
Neural PID: a neural network maps the current state to the desired PID coefficients.
Neural Prop: a neural network maps the current state to the desired proportional control coefficients.
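
To make the latter two control types concrete, here is an illustrative sketch of a neural PID controller using the drone's dimensions from the table (state dim 12, action dim 18, read here as 3 gains x 6 error channels). The network, the gain layout, and the error terms are all hypothetical, not the repo's actual controller; Neural Prop is the same idea with only the proportional term.

import torch
import torch.nn as nn

# Hypothetical "Neural PID": a policy network maps the current state to
# PID coefficients, and a classical PID law turns tracking errors into
# low-level commands.
policy = nn.Sequential(nn.Linear(12, 64), nn.Tanh(), nn.Linear(64, 18))

state = torch.zeros(12)               # placeholder robot state
gains = policy(state).reshape(3, 6)   # (Kp, Ki, Kd) x 6 error channels
kp, ki, kd = gains[0], gains[1], gains[2]

error = torch.zeros(6)                # goal-tracking error (placeholder)
error_integral = torch.zeros(6)       # accumulated error
error_derivative = torch.zeros(6)     # rate of change of the error

# Low-level command fed to the simulator; "Neural Prop" keeps only kp * error.
command = kp * error + ki * error_integral + kd * error_derivative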

Reinforcement Learning Controllers:

Controllers are trained with Proximal Policy Optimization (PPO). To train a controller from scratch:

python examples/train.py --env-name point

To finetune a trained policy:

python examples/train.py --env-name point --finetune

Training logs and intermediate policies are saved in data/tmp.
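
Conceptually, the training script builds the goal-conditioned environment and hands it to PPO. The sketch below shows that loop with stable-baselines3 and a stand-in Gymnasium task; whether the repo actually uses stable-baselines3, and how it constructs its environments, are assumptions here.

import gymnasium as gym
from stable_baselines3 import PPO

# Stand-in continuous-control task; train.py would instead build the
# mobrob environment selected by --env-name.
env = gym.make("Pendulum-v1")

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
model.save("data/tmp/ppo_point")  # mirrors the data/tmp convention above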

To run a trained controller:

python examples/control.py --env-name point

To disable the GUI, e.g., when running on a remote server:

python examples/control.py --env-name point --no-gui

Alternatively, you can use pyvirtualdisplay and store the rendered video.
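
For example, a headless run can wrap the script in a virtual display (pyvirtualdisplay requires Xvfb on the machine):

from pyvirtualdisplay import Display

# Start a virtual X server so rendering works without a physical display.
display = Display(visible=0, size=(1024, 768))
display.start()

# ... run the control loop and record frames here ...

display.stop()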

Customization

To build your own goal-conditioned environment, override the following abstract methods of EnvWrapper (in wrapper.py) according to the needs of the new robot environment. The methods and brief descriptions are given in the table below:

| Method                      | Description                                                             |
|-----------------------------|-------------------------------------------------------------------------|
| _set_goal(self, goal)       | Sets the goal position of the robot, e.g., [x, y, z]                    |
| build_env(self)             | Constructs the environment, i.e., loads the robot and the world         |
| get_pos(self)               | Returns the current position of the robot, e.g., [x, y, z]              |
| set_pos(self, pos)          | Sets the position of the robot, e.g., [x, y, z]                         |
| get_obs(self)               | Returns the current observation of the robot, e.g., [x, y, z, r, p, y]  |
| get_observation_space(self) | Returns the observation space of the robot, e.g., Box(58,)              |
| get_action_space(self)      | Returns the action space of the robot, e.g., Box(12,)                   |
| get_init_space(self)        | Returns the initial position space of the robot, e.g., Box(3,)          |
| get_goal_space(self)        | Returns the goal space of the robot, e.g., Box(3,)                      |

One may refer to the other robot environment wrappers in wrapper.py for more details.
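
A skeleton of such a wrapper might look like the following. Only the method names come from the table above; the base-class import path, the self.env attribute holding the underlying simulator, and all bounds and shapes are placeholders to adapt.

import numpy as np
from gym.spaces import Box  # or gymnasium.spaces, matching the repo's gym flavor

from mobrob.envs.wrapper import EnvWrapper  # assumed import path


class MyRobotEnv(EnvWrapper):
    def build_env(self):
        # Load the robot and the world in your simulator; return the raw env.
        raise NotImplementedError

    def _set_goal(self, goal):
        # Tell the underlying simulator where the goal is, e.g., [x, y, z].
        self.env.goal = np.asarray(goal)

    def get_pos(self):
        # Current robot position, e.g., [x, y, z].
        return np.asarray(self.env.robot_pos)

    def set_pos(self, pos):
        # Teleport the robot, e.g., for episode resets.
        self.env.robot_pos = np.asarray(pos)

    def get_obs(self):
        # Full observation vector, e.g., [x, y, z, r, p, y].
        return np.asarray(self.env.sensor_readings)

    def get_observation_space(self):
        return Box(low=-np.inf, high=np.inf, shape=(6,))

    def get_action_space(self):
        return Box(low=-1.0, high=1.0, shape=(2,))

    def get_init_space(self):
        return Box(low=-5.0, high=5.0, shape=(3,))

    def get_goal_space(self):
        return Box(low=-5.0, high=5.0, shape=(3,))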

Publications

This repository is used as the benchmark environment in the following papers:

@inproceedings{mfnlc,
  author={Xiong, Zikang and Eappen, Joe and Qureshi, Ahmed H. and Jagannathan, Suresh},
  booktitle={2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, 
  title={Model-free Neural Lyapunov Control for Safe Robot Navigation}, 
  year={2022},
  pages={5572-5579},
  doi={10.1109/IROS47612.2022.9981632}}

@article{dscrl,
  title={Co-learning Planning and Control Policies Using Differentiable Formal Task Constraints},
  author={Xiong, Zikang and Eappen, Joe and Lawson, Daniel and Qureshi, Ahmed H. and Jagannathan, Suresh},
  journal={arXiv preprint arXiv:2303.01346},
  year={2023}
}