Python package to control the Dobot Magician Robotic Arm using a trained policy.
This repository currently provides -
- A Python class to interface with and control the Dobot Arm
- Policies trained on Gym Fetch environments
- Scripts to control the Dobot Arm with these policies
The documentation section describes each of these in detail.
Ideally, the policies should be trained on environments provided by gym-dobot, but that is currently WIP.
- Python 3 (tested on Python 3.6)
- gym > 0.10.3
- mujoco_py > 1.5
- mujoco - mjpro150
- baselines - 0.1.5
The Gym environment used for this project makes use of additional functions to map from sim to the real world, set goals/objects, etc., so use this modified fork of gym instead. If the fork falls too far behind the original repo, or if you don't want to reinstall gym, then refer to the functions mentioned here in the modified fork and manually add them to your gym installation.
```bash
git clone https://github.com/WarrG3X/dobot-rl
cd dobot-rl
pip install -e .
```
The first step is to ensure that the arm is controllable. Connect the USB cable and use `ls /dev` to find which port it is using. Now make sure you have the proper access permissions for this port by using `chmod`. You'll have to do this each time you plug in the Dobot. A much more convenient way is to simply add the user to the `dialout` group.
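For example, assuming the arm shows up as `ttyUSB0` (adjust the port to match your system):

```bash
# Quick fix - grant read/write access to the port (needed after every replug)
sudo chmod 666 /dev/ttyUSB0

# Permanent fix - add the current user to the dialout group (log out and back in afterwards)
sudo usermod -a -G dialout $USER
```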
Now use the `dobot_cli` script to test it.
Note - Before running, ensure that the arm has enough space around it, as it moves around a bit for calibrating the initial position. This must be kept in mind whenever the arm is being initialized.
```bash
# Default port = ttyUSB0
python -m dobot_rl.scripts.dobot_cli

# For any other port
python -m dobot_rl.scripts.dobot_cli --port=<ttyUSBX>
```
Try typing the commands shown at the `dobot>` prompt to move the Dobot arm and test the gripper. If the arm doesn't move, it's probably because the given coordinates are out of bounds. For example, try `m 230 0 0 0` for moving and `g 1 1` for the gripper.
Currently the Dobot arm can do the Reach task and the PickAndPlace task.
```bash
# To launch Reach
python -m dobot_rl.scripts.run_policy --robot=1 --env=FetchReach-v1 --policy_file=fetch_reach_policy_best.pkl

# To launch PickAndPlace
python -m dobot_rl.scripts.run_policy --robot=1 --env=FetchPickAndPlace-v1 --policy_file=fetch_pick_policy_best.pkl

# To show available flags / additional options
python -m dobot_rl.scripts.run_policy --help
```
Note - Ensure that the gripper is sideways with respect to the arm. Also, preferably use a soft object (like foam/sponge) for the PickAndPlace task. Using hard objects is not recommended, as the policy is not perfect and in some cases the arm tries to push into the object, which can damage the arm. The most crucial part for the task to be successfully executed is the mapping from the sim to the real world, which is described below.
This section describes each aspect of the project in detail.
The `DobotController` class in `dobot_controller.py` is used to interface with the arm. For the Dobot arm it is necessary that the underlying API is properly disconnected after each use, which is ensured by this class. As mentioned in the usage section, always ensure that the correct port is being used and that the user has proper privileges.
The `DobotController` class currently provides two functions -

- `movexyz(x,y,z,r,q)` - Move to position x,y,z with rotation r.
  Note - There is a bug with this method where the rotation r only works for the first move command, after which the gripper stops rotating for subsequent commands. For now this isn't a major issue, as the rotation only has to be set once in the beginning.
- `grip(grip,t,q)` - grip is a binary value (0/1) where 1 denotes closed and 0 denotes open. t refers to the duration for which to enable the vacuum pump. The default t=0.5 is enough to completely open or close the gripper. A smaller value such as 0.35 will partially open/close the gripper, but the value must usually be greater than 0.2 to have any effect. A mapping from time duration to how wide the gripper opens would be quite useful to implement in the future, as it would provide much more precise control for harder tasks.

For both of these, q denotes whether the command is queued or not. Refer to the Dobot API for a detailed description.
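For example, a minimal usage sketch (the import path and constructor arguments below are assumptions; check `dobot_controller.py` for the actual signature):

```python
from dobot_rl.utils.dobot_controller import DobotController

# Port name is an assumption - use whichever port the arm enumerated on
dobot = DobotController(port='ttyUSB0')

dobot.movexyz(230, 0, 0, 0, 1)   # move to (230, 0, 0) with rotation 0, queued
dobot.grip(1, 0.5, 1)            # close the gripper (t=0.5, queued)
dobot.grip(0, 0.5, 1)            # open it again
```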
Currently only these two methods are implemented, as they are sufficient to perform the required tasks. The `dobot_helper_functions.py` file in the `legacy_scripts` folder has some additional methods which weren't re-implemented in `dobot_controller.py`, but it can be used as a reference for extending the functionality of the `DobotController` class.
All the files needed for controlling the Dobot arm are located in the `utils` folder -

- `dobot_controller.py` - Contains the `DobotController` class.
- `libDobotDll.so.1.0.0` - The library needed by the API. Without this the API won't work.
- `DobotDllType.py` - The main file that actually implements the Dobot API. It is quite exhaustive and implements all the methods referred to in the Dobot API Description. It is also responsible for correctly loading the Dll file. It is currently set to first search in the `utils` folder and, if that fails, it falls back to the `LD_LIBRARY_PATH`. So if you move this file somewhere else, ensure that `libDobotDll.so.1.0.0` can be found in one of the locations in `LD_LIBRARY_PATH`.
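For example, if you move the library out of `utils`, something like this (the path is a placeholder) makes it discoverable again:

```bash
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/folder/containing/the/dll
```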
As mentioned in the Usage/Dobot Arm section, `dobot_cli` is a convenient script to test that the Dobot arm is working.
Mapping is needed for translating coordinates in the simulation to real-world coordinates. The mapping is done by two functions, `sim2real` and `real2sim`, which are defined in fetch_env.py in the modified gym fork. These functions are also defined in dobot_env.py in the gym-dobot package.
The range of the Fetch robotic arm is much greater than that of the Dobot arm, so the coordinates are downscaled to map to the real-world Dobot arm. These functions also clip the input values so that the arm doesn't go out of range.
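Conceptually, the mapping boils down to a linear rescale plus clipping, along these lines. All of the constants below are made-up placeholders for illustration, not the values used in the fork:

```python
import numpy as np

# Placeholder workspace bounds for the real Dobot arm (mm) - not the fork's values
DOBOT_LOW = np.array([150.0, -150.0, -20.0])
DOBOT_HIGH = np.array([300.0, 150.0, 150.0])

# Placeholder sim reference point (roughly the table centre in the Fetch envs) and scale
SIM_ORIGIN = np.array([1.34, 0.75, 0.41])
SCALE = 400.0  # real-world mm per sim unit

def sim2real(sim_pos):
    """Linearly rescale a sim position to Dobot coordinates, clipped to the workspace."""
    centre = (DOBOT_LOW + DOBOT_HIGH) / 2
    real = (np.asarray(sim_pos) - SIM_ORIGIN) * SCALE + centre
    return np.clip(real, DOBOT_LOW, DOBOT_HIGH)

def real2sim(real_pos):
    """Inverse of sim2real (without the clipping)."""
    centre = (DOBOT_LOW + DOBOT_HIGH) / 2
    return (np.asarray(real_pos) - centre) / SCALE + SIM_ORIGIN
```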
If these functions are modified, use the `fetch_viz.py` script in the `scripts` folder to test the mapping. It provides a `Tkinter`-based GUI to control a Fetch arm using a trained reach policy by moving the goal position around.
```bash
# To run a basic visualization.
# The Tkinter GUI will have sliders for the x,y,z coordinates which can be used to control the robot.
python -m dobot_rl.scripts.fetch_viz

# To control the robot along with the visualization. Press the Move Dobot button to move the robot and press the Grip button to toggle the gripper position.
python -m dobot_rl.scripts.fetch_viz --robot=1
```
The main objective is to ensure that the robot in the simulation is able to reach every point on the table and that the real-world robot can follow it.
The `policies` folder contains policies trained for the `FetchPush`, `FetchPickAndPlace` and `FetchReach` environments. These policies were trained using openai/baselines. Currently the above code has only been tested for the `FetchPickAndPlace` and `FetchReach` environments.
As stated in the introduction, ideally the policies should be trained on environments provided by gym-dobot, but that is currently WIP.
The `run_policy` script in the `scripts` directory actually controls the robot using a trained policy. Run `python run_policy.py --help` from within the `scripts` directory to see the various flags/options.
The basic working of the script is as follows -
- Load the policy file.
- Initialize `DobotController`.
- Set the time horizon `T`. This should be set to the same value as `max_episode_steps` defined in `__init__.py` in the `gym/envs/` folder, as the main loop is run for `T` iterations. It is `50` by default.
- For each test rollout, the environment is reset. The `env.reset()` function has been modified to take an additional parameter `goal=[x,y,z]` to initialize the environment with that goal. Also, a new function `env.set_object(pos)` has been added to similarly initialize the object position. Refer to fetch_env.py and robot_env.py to see their implementations.
- For each timestep in an episode, we first get the `policy_output` by using the `policy.get_actions()` function and then the `obs` by calling `step(policy_output)` on the `env`. The actual `observation` is stored in a variable `o`, which is an array of values. The first three values correspond to the `x,y,z` of the `grip_site`. So we set `pos = env.sim2real(o[:3])` to obtain the actual gripper position after converting it to real-world coordinates. This value is appended to a list called `points` which stores the gripper position for each timestep.
- For each timestep we also need the corresponding gripper status `[open/close]`. The fourth value of the `policy_output` corresponds to the gripper value and ranges from `-1` to `1`. We first convert it to a binary `0/1` value where `0=closed` and `1=open`, then XOR the value with 1 to flip it so that `0=open` and `1=closed`, which is in accordance with the `grip` function in `DobotController`. This value is appended to a list called `grips` which stores the gripper status for each timestep.
- Now both of the lists, `points` and `grips`, have 50 sets of values each. But if we simply wrote all 50 values to the Dobot, it would be very slow. Thus we first use a line simplification algorithm called `Visvalingam-Whyatt` for poly-line vertex reduction. A working implementation of the algorithm has been taken from here and slightly modified for our use case. Refer to the `polysimplify` script in the `utils` directory for its implementation. We define a value `N` which denotes the number of points to extract: `N=4` for the `Reach` environment and `N=10` for the `Pick` environment. We store the reduced set of points obtained from the simplifier in `traj_points`.
- Now for the `Pick` task, there are many coordinates which are important for the gripper to be able to pick the object (e.g. the coordinate just above an object where the gripper can close to grasp it). But the line simplifier approximates the trajectory to `10` points, so it is not guaranteed that such coordinates will be preserved. As mentioned earlier, each point stored in `points` has its corresponding gripper status in `grips`. So we go through `points` and select each point where the value in `grips` differs from the previous value; the points where the gripper status changes are exactly the points where the gripper is actually opening/closing. This second set of relevant points is stored in `grip_points`.
- Finally we take the union of both `traj_points` and `grip_points`, called `final_points`, which gives us a reduced set of values that preserves both the trajectory and the positions relevant for the gripper. Note that `grip_points` are not calculated in the case of the `Reach/Push` tasks as the gripper is blocked. For each point `[x,y,z]` in `final_points` and its corresponding gripper status `g` in `grips`, we call `dobot.movexyz(x,y,z,r)` and `dobot.grip(g)`, where `dobot` is the `DobotController` object. A condensed sketch of this loop is shown below.
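For reference, here is a condensed sketch of the above steps. This is not the actual `run_policy` implementation: the policy call signature, the `simplify_indices` helper (a hypothetical stand-in for the `polysimplify` utility that returns the indices of the points it keeps) and the example goal values are all assumptions.

```python
# Assumed to be set up elsewhere: `policy` (loaded from the .pkl file),
# `env` (the modified gym environment) and `dobot` (a DobotController instance).

T = 50   # time horizon, matching max_episode_steps in gym/envs/__init__.py
N = 10   # points to keep after simplification (4 for Reach, 10 for Pick)
r = 0    # fixed gripper rotation, set once at the start

obs = env.reset(goal=[1.3, 0.75, 0.45])   # modified reset; goal values are placeholders
points, grips = [], []

for t in range(T):
    # Exact call signature depends on the baselines policy object
    policy_output = policy.get_actions(obs['observation'], obs['achieved_goal'], obs['desired_goal'])
    obs, reward, done, info = env.step(policy_output)

    o = obs['observation']
    points.append(env.sim2real(o[:3]))   # gripper x,y,z mapped to real-world coordinates

    # Fourth action value in [-1, 1] -> binary, then XOR with 1 so that
    # 0 = open and 1 = closed, matching DobotController.grip()
    grips.append(int(policy_output[3] > 0) ^ 1)

# Reduce the T-point trajectory, but keep every point where the gripper flips
traj_idx = simplify_indices(points, N)
grip_idx = [i for i in range(1, T) if grips[i] != grips[i - 1]]
final_idx = sorted(set(traj_idx) | set(grip_idx))

for i in final_idx:
    x, y, z = points[i]
    dobot.movexyz(x, y, z, r)
    dobot.grip(grips[i])
```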