ros2/rmw_cyclonedds

ROS2 communication stopped between nodes on same machine when network is down

robert-preissl opened this issue · 3 comments

Bug report

Required Info:

  • Operating System:
    • Ubuntu 22.04.3
  • Installation type:
    • binaries (running the latest ros2/iron docker image)
  • Version or commit hash:
    • ros2 Iron
  • DDS implementation:
    • ros-iron-cyclonedds 0.10.4-1jammy
    • ros-iron-rmw-cyclonedds-cpp 1.6.0-2jammy
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

The setup is as follows: one computer connected to a wifi network, with two terminals each running a ros2 iron docker container (started with --network=host).
(Both the wireless interface and the loopback interface lo are configured with MULTICAST enabled.)
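For anyone reproducing this, the multicast setting mentioned above can be checked from the interface flags that Linux exposes under /sys/class/net. The following is a minimal sketch, not part of the original report: the interface name wlan0 is a placeholder for whatever the wireless interface is called on your machine, and the helper function names are made up for illustration.

```shell
#!/bin/sh
# IFF_MULTICAST is bit 0x1000 of the Linux interface flags (see <linux/if.h>).
IFF_MULTICAST=4096

# Succeeds if the given flags value (e.g. 0x1003) has the MULTICAST bit set.
flags_have_multicast() {
    [ $(( $1 & $IFF_MULTICAST )) -ne 0 ]
}

# Succeeds if the named interface currently has MULTICAST set.
# Returns 2 if the interface does not exist or its flags cannot be read.
iface_is_multicast() {
    flags_file="/sys/class/net/$1/flags"
    [ -r "$flags_file" ] || return 2
    flags_have_multicast "$(cat "$flags_file")"
}

# wlan0 is a placeholder; substitute the real wireless interface name.
for iface in lo wlan0; do
    if iface_is_multicast "$iface"; then
        echo "$iface: MULTICAST set"
    else
        echo "$iface: MULTICAST not set (or interface missing)"
    fi
done
```

On many systems lo does not have MULTICAST set by default; it can be enabled with `sudo ip link set lo multicast on`.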

  • run the ros2 listener node in one terminal (with the environment variable RMW_IMPLEMENTATION=rmw_cyclonedds_cpp set):
    ros2 run demo_nodes_cpp listener

  • in a second terminal run the ros2 talker node (same environment variable as above):
    ros2 run demo_nodes_cpp talker

  • the terminal with the listener prints Hello World messages with increasing ids. (1,2,3, etc.)

  • Now, we disconnect the wifi on the computer

  • the terminal with the listener no longer prints new Hello World messages

  • the terminal with the talker prints error messages like tev: ddsi_udp_conn_write to udp/192.168.128.127:21163 failed with retcode -3
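The steps above can be condensed into a small helper that prints the command for each terminal. This is only a sketch under assumptions that are not in the report: the image name is a hypothetical placeholder for an Iron image that has demo_nodes_cpp and rmw_cyclonedds_cpp installed, and the wifi is assumed to be managed by NetworkManager (hence nmcli).

```shell
#!/bin/sh
# Hypothetical image name; replace with the Iron image actually in use.
IMAGE="${IMAGE:-ros-iron-cyclonedds-demo}"

# Print the docker command that starts one demo node (listener or talker)
# on the host network with Cyclone DDS selected as the RMW implementation.
node_command() {
    printf 'docker run --rm -it --network=host -e RMW_IMPLEMENTATION=rmw_cyclonedds_cpp %s ros2 run demo_nodes_cpp %s\n' "$IMAGE" "$1"
}

echo "# terminal 1"
node_command listener
echo "# terminal 2"
node_command talker
echo "# once messages are flowing, on the host (assumes NetworkManager)"
echo "nmcli radio wifi off"
```

The script only prints the commands rather than running them, since the listener and talker need to stay in the foreground in separate terminals.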

Expected behavior

  • since all the communication is local (i.e., on the same machine), an interruption in the wifi network should not interrupt local ROS communication

Actual behavior

  • communication from the talker to the listener (both on the same host, but in different docker containers) stops: the listener receives no further messages.

@eboasson may be able to comment here

I do have some ideas, but I do need to do a bit of checking to be sure what exactly is causing it, and what would be the solution (workaround, change to Cyclone, whatever). I did want to let you know I am now—belatedly—aware of the issue.