EtherCAT shared memory issue when controlling multiple arms
Closed this issue · 2 comments
Hi Yihuai, ran into this while testing the new Ecat connection.
When attempting to control multiple arms in one Python process, the started connections interfere with each other in unpredictable ways. Sometimes this is with the second connection overriding the first one, sometimes the second connection fails to start up. I believe this can be dangerous to the robots if it leads to invalid commands sent to the motors (my arms are ok though so no harm caused).
I noticed this problem does not occur when controlling the arms separately from separate processes. I managed to replicate the issue by the following code - start the arm connections either in separate threads or in separate processes:
separate threads - issue occurs
def run_1():
arx5_0 = arx5.Arx5JointController("L5", "enx____1")
arx5_0.enable_background_send_recv()
arx5_0.reset_to_home()
arx5_0.enable_gravity_compensation("../models/arx5.urdf")
time.sleep(5)
def run_2():
arx5_1 = arx5.Arx5JointController("L5", "enx____2")
arx5_1.enable_background_send_recv()
arx5_1.reset_to_home()
arx5_1.enable_gravity_compensation("../models/arx5.urdf")
time.sleep(5)
def main():
np.set_printoptions(precision=3, suppress=True)
# robot_config = arx5_0.get_robot_config()
# controller_config = arx5_0.get_controller_config()
import threading
t1 = threading.Thread(target=run_1)
t2 = threading.Thread(target=run_2)
t1.start()
time.sleep(2)
t2.start()
time.sleep(5)
t1.join()
t2.join()
exit()
separate processes - ok, no issues
def run_1():
arx5_0 = arx5.Arx5JointController("L5", "enx____1")
arx5_0.enable_background_send_recv()
arx5_0.reset_to_home()
arx5_0.enable_gravity_compensation("../models/arx5.urdf")
time.sleep(5)
def run_2():
arx5_1 = arx5.Arx5JointController("L5", "enx____2")
arx5_1.enable_background_send_recv()
arx5_1.reset_to_home()
arx5_1.enable_gravity_compensation("../models/arx5.urdf")
time.sleep(5)
def main():
np.set_printoptions(precision=3, suppress=True)
# robot_config = arx5_0.get_robot_config()
# controller_config = arx5_0.get_controller_config()
import multiprocessing
t1 = multiprocessing.Process(target=run_1)
t2 = multiprocessing.Process(target=run_2)
t1.start()
time.sleep(2)
t2.start()
time.sleep(5)
t1.join()
t2.join()
exit()
Because separate processes fixes the issue (and is a workaround) I suspect there is an issue with shared memory somewhere in the c++ code or in the CAN / ECAT libraries. The issue might also occur with CAN connections, I just haven't been able to test it.
Again, thanks for the good work with this library!
Hi Ville, thanks for pointing out this issue!
Previouly I've been using USB-CAN adapters for multiple arms and the code works well. For the EtherCAT interface, I've been using the global variables provided by the SOEM library. I just modified the code and now it's using local variables. Hopefully this will address the issue.
However, I have only one arm right now so I'm unable to test it. Could you checkout the yihuai
branch and test the code on multiple arms? Since each controller for an arm has a thread running in the background, you don't need to use python (fake) multithreading in your python code. You can refer to python/examples/test_bimanual.py
(remember to modify the robot model and interface names).
I also noticed another problem for EtherCAT-CAN interface: the arm sometimes disconnects without any error messages and stuck at a random pose (USB-CAN doesn't have this problem). Have you ever seen the same behavior? I'll merge the code into the main branch after this problem is solved.
Merged