ROS 2 message flow analysis experiments

ROS 2 message flow analysis experiments using ros2_tracing and Eclipse Trace Compass.

This is part of the ROS 2 message flow paper. If you use or refer to this method or this repository, please cite:

C. Bédard, P.-Y. Lajoie, G. Beltrame, and M. Dagenais, "Message Flow Analysis with Complex Causal Links for Distributed ROS 2 Systems," Robotics and Autonomous Systems, vol. 161, p. 104361, 2023.

BibTeX:

@article{bedard2023messageflow,
  title={Message flow analysis with complex causal links for distributed {ROS} 2 systems},
  author={B{\'e}dard, Christophe and Lajoie, Pierre-Yves and Beltrame, Giovanni and Dagenais, Michel},
  journal={Robotics and Autonomous Systems},
  year={2023},
  volume={161},
  pages={104361},
  doi={10.1016/j.robot.2022.104361}
}

Relevant repositories

ros2_tracing: tracing instrumentation and launch tools for ROS 2
- repository
- branch: message-link-instrumentation
DDS implementations
- Fast DDS
  - repository
  - branch: instrumentation-lttng
- Cyclone DDS
  - repository
  - branch: instrumentation-lttng
Experimentation-related
- Message flow test cases
  - repository
- Autoware reference system
  - repository
  - branch: message-link-instrumentation

Experiments

For all systems:

Set up the system to build ROS 2 and enable tracing
- https://docs.ros.org/en/rolling/Installation/Ubuntu-Development-Setup.html
- https://github.com/ros2/ros2_tracing
  - The LTTng kernel tracer is required for some experiments (the examples and experiment 1)

Examples

See https://github.com/christophebedard/ros2-message-flow-test-cases.

For each of the 2 systems
1. Make sure that the LTTng kernel tracer is installed
  - https://github.com/ros2/ros2_tracing#building
2. Set up the code workspace and build it
```
./exp-1_setup_workspace.sh
```
  - We use the same workspace as experiment 1

Run examples

First run the single-system examples on the system of your choice

source exp-1_ws/install/setup.bash
ros2 launch ros2_message_flow_testcases examples/example-2_trivial.launch.py
ros2 launch ros2_message_flow_testcases examples/example-3_periodic_async.launch.py
ros2 launch ros2_message_flow_testcases examples/example-4_partial_sync.launch.py

Then run the distributed example over 2 systems

On system 1

source exp-1_ws/install/setup.bash
ros2 launch ros2_message_flow_testcases examples/example-1_transport_1.launch.py

On system 2

source exp-1_ws/install/setup.bash
ros2 launch ros2_message_flow_testcases examples/example-1_transport_2.launch.py

The two launch commands can be started in either order.

Data will be written to examples/trace-example-*-YYYYMMDDTHHMMSS
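
To locate the resulting traces programmatically, a minimal sketch along these lines may help. It assumes only the directory naming pattern given above; the function name `list_example_traces` is hypothetical and not part of this repository.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed naming, based on the pattern above: trace-example-<name>-YYYYMMDDTHHMMSS
TRACE_DIR_PATTERN = re.compile(
    r"^trace-example-(?P<name>.+)-(?P<stamp>\d{8}T\d{6})$"
)


def list_example_traces(examples_dir):
    """Return (name, timestamp, path) tuples for trace directories, oldest first."""
    traces = []
    for path in Path(examples_dir).glob("trace-example-*"):
        match = TRACE_DIR_PATTERN.match(path.name)
        # Skip files and directories that do not end with a timestamp
        if path.is_dir() and match:
            stamp = datetime.strptime(match.group("stamp"), "%Y%m%dT%H%M%S")
            traces.append((match.group("name"), stamp, path))
    return sorted(traces, key=lambda trace: trace[1])


if __name__ == "__main__":
    for name, stamp, path in list_example_traces("examples"):
        print(f"{stamp.isoformat()}  {name}  {path}")
```

Sorting by timestamp makes it easier to match the traces from system 1 and system 2 that belong to the same run of the distributed example.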

Autoware reference system

In this experiment, we run and trace the Autoware reference system proposed by the ROS 2 Real-Time Working Group. We first run it in a single process on a single system, and then distribute it across multiple processes on 2 systems.

For each of the 2 systems
1. Make sure that the LTTng kernel tracer is installed
  - https://github.com/ros2/ros2_tracing#building
2. Set up the code workspace and build it
```
./exp-1_setup_workspace.sh
```
  - this creates a workspace and builds it in release mode
  - the workspace includes all of ROS 2 from source, as well as some additional repos and specific branches for some of the ROS 2 repos (see reference_system.repos)

Run experiment

On a single system

source exp-1_ws/install/setup.bash
ros2 launch experiment-1/reference_system.launch.py

Distributed over 2 systems

On system 1

source exp-1_ws/install/setup.bash
ros2 launch experiment-1/reference_system_1.launch.py

On system 2

source exp-1_ws/install/setup.bash
ros2 launch experiment-1/reference_system_2.launch.py

The two launch commands can be started in either order.

Variant: launch the same system again, but with reference_system_1b.launch.py for system 1, which uses a multi-threaded executor for one of the most critical processes
- On system 1
```
source exp-1_ws/install/setup.bash
ros2 launch experiment-1/reference_system_1b.launch.py
```
- On system 2
```
source exp-1_ws/install/setup.bash
ros2 launch experiment-1/reference_system_2.launch.py
```
Experiment data will be written to experiment-1/trace-reference-system*-YYYYMMDDTHHMMSS

Analyze the traces
- See Analysis

RTAB-Map

In this experiment, we distribute and run RTAB-Map over 2 systems and trace it. We have 4 components: the camera driver node, the odometry node, the RTAB-Map node, and rviz. These can be split up into two separate groups, one for each system.

For each of the 2 systems
1. Set up rtabmap_ros using the ros2 branch
  - Follow the build instructions
2. Prepare camera and driver
  - We use an Intel RealSense D400, so we use the realsense_d400.launch.py launch file
Synchronize system clocks
1. Using NTP or PTP
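
Before tracing, it can be useful to confirm on each system that the clock is reported as synchronized. The sketch below is one possible check, assuming a systemd-based host where `timedatectl show` is available. It reads the `NTPSynchronized` property, which may not reflect every PTP setup, so treat it as a sanity check only. The helper names are hypothetical.

```python
import subprocess


def parse_timedatectl_show(output):
    """Parse the KEY=VALUE lines printed by `timedatectl show` into a dict."""
    properties = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            properties[key.strip()] = value.strip()
    return properties


def is_clock_synchronized(properties):
    """Return True if the system reports its clock as synchronized."""
    return properties.get("NTPSynchronized") == "yes"


if __name__ == "__main__":
    # Assumes a systemd-based host; run this on both systems
    result = subprocess.run(
        ["timedatectl", "show"], capture_output=True, text=True, check=True
    )
    properties = parse_timedatectl_show(result.stdout)
    print("synchronized:", is_clock_synchronized(properties))
```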

Modify launch files

Add a Trace action to the existing launch files (realsense_d400.launch.py and rtabmap.launch.py) so that the system is traced whenever they are launched:

# ...
from tracetools_launch.action import Trace
# ...
return LaunchDescription([
    # Tracing
    Trace(
        session_name='rtabmap-kitti',
        events_ust=[
            'ros2:*',
            'dds:*',
        ],
        events_kernel=[],
    ),
    # ...
])
# ...

Run experiment
- On system 1
```
ros2 launch rtabmap_ros realsense_d400.launch.py
```
- On system 2
```
ros2 launch rtabmap_ros rtabmap.launch.py
```
- The rtabmap.launch.py launch file can be modified to launch the *_odometry node and the rtabmap node separately
  - The *_odometry node can then be run on system 1
- Launch rviz on system 2 for visualization
- Experiment data will be written to ~/.ros/tracing/rtabmap-kitti on each system
Analyze the traces
- See Analysis

Overhead

In this experiment, we evaluate the end-to-end latency of a typical system with tracing disabled and with tracing enabled. The difference in end-to-end latency between the two runs is the tracing overhead.

For each of the 2 systems
1. Workspace with tracing
```
./exp-1_setup_workspace.sh
```
  - We use the same workspace as experiment 1
2. Workspace without any tracepoints or instrumentation
```
./exp-3_setup_workspace.sh
```

Run experiment

First with tracing

source exp-1_ws/install/setup.bash
ros2 launch overhead/end_to_end_tracing.launch.py

Latency data will be written to latencies_tracing_*.txt

Then without tracing

source exp-3_ws/install/setup.bash
ros2 launch overhead/end_to_end_no-tracing.launch.py

Latency data will be written to latencies_no-tracing_*.txt

Plot results
- Provide the names of the two latency files
```
cd overhead/
python3 plot_latencies.py latencies_no-tracing_*.txt latencies_tracing_*.txt
```
- A plot will be displayed and exported to a file
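
To complement the plot with a numeric summary, the overhead can be estimated as the difference between the two latency distributions. The following is only a sketch and is not part of plot_latencies.py. It assumes each latency file holds one numeric value per line, which may not match the actual file format, and the function names are hypothetical.

```python
import statistics
import sys
from pathlib import Path


def read_latencies(path):
    """Read one latency value per non-empty line (assumed file format)."""
    lines = Path(path).read_text().splitlines()
    return [float(line) for line in lines if line.strip()]


def overhead_summary(no_tracing, tracing):
    """Return mean and median latency differences (tracing minus no tracing)."""
    return {
        "mean": statistics.mean(tracing) - statistics.mean(no_tracing),
        "median": statistics.median(tracing) - statistics.median(no_tracing),
    }


if __name__ == "__main__":
    # Usage: python3 <this script> <no-tracing file> <tracing file>
    baseline = read_latencies(sys.argv[1])
    traced = read_latencies(sys.argv[2])
    print(overhead_summary(baseline, traced))
```

The values are in whatever unit the latency files use. Reporting the median alongside the mean limits the influence of occasional latency outliers.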

Analysis

Download Eclipse Trace Compass
- Install ROS 2 features from the Trace Compass Incubator:
  - Open Trace Compass, click on Help, then Install New Software...
  - Enter the following update site URL: https://download.eclipse.org/tracecompass.incubator/master/repository/
  - Under Trace Types, select Trace Compass ROS 2 (Incubation)
  - Click Next twice, then accept the license terms, and click Finish
  - When prompted, restart Trace Compass
- Or use the provided Dockerfile:
```
docker build --tag tc-incubator tc-incubator/
docker run --net=host -e DISPLAY -v ~/.ros/tracing:/root/.ros/tracing -v ~/.tracecompass:/root/.tracecompass tc-incubator
```
Run Trace Compass
- See the Run (or Debug) the plugins section
- See the Trace Compass User Guide for a full user guide
Import trace and visualize
1. Under File, click on Import...
2. Select the root directory of the trace (system-YYYYMMDDTHHMMSS/)
  - See the experiment instructions for the path to the trace directory
3. Then make sure the trace directory is selected in the filesystem tree view
4. Click on Finish
5. (See also the user guide)
Create experiment (i.e., an aggregation of multiple traces)
1. In the tree view on the left, under Traces, select both traces
2. Then right click, and, under Open As Experiment..., select ROS 2 Experiment
3. (See also the user guide)
(For the Autoware reference system experiment) Synchronize traces
- See Synchronize traces in Trace Compass
Open Messages view
- This shows timer and subscription callbacks, as well as message publications and receptions, over time for each node.
- Arrows also provide links between subscription or timer callbacks and message publications, as well as between message publications and the resulting subscription callback(s).
Navigate and inspect the trace
- Basic controls:
  - Ctrl and mouse wheel up/down to zoom in/out
  - Shift and mouse wheel up/down to move left/right
Run Message Flow analysis
1. Click on a segment in the Messages view, i.e., message publication or timer/subscription callback instance
2. Click on the Follow this element button in the top right of the view (hover over buttons to see their description)
3. The analysis will run and should finish quickly (usually within 5 to 10 seconds)
Open Message Flow view to view the analysis results
1. Press Ctrl+3 and enter Message Flow
2. In the results below, click on Message Flow (incubator) (ROS 2)
3. The Message Flow view should open
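
Conceptually, following an element amounts to walking the causal links shown as arrows in the Messages view, starting from the selected segment. The toy sketch below illustrates that idea with a plain forward graph traversal. It is not the Trace Compass implementation, and the segment names and the `links` mapping are hypothetical.

```python
from collections import deque


def follow_message_flow(links, start):
    """Return the segments reachable from start via causal links, in visit order."""
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        segment = queue.popleft()
        for successor in links.get(segment, []):
            # Each segment is reported once, even if reached through several links
            if successor not in seen:
                seen.add(successor)
                visited.append(successor)
                queue.append(successor)
    return visited


# Hypothetical links: callback -> publication -> resulting subscription callback(s)
links = {
    "timer_callback@node_a": ["publish:/topic_1"],
    "publish:/topic_1": [
        "subscription_callback@node_b",
        "subscription_callback@node_c",
    ],
    "subscription_callback@node_b": ["publish:/topic_2"],
    "publish:/topic_2": ["subscription_callback@node_c"],
}

if __name__ == "__main__":
    for segment in follow_message_flow(links, "timer_callback@node_a"):
        print(segment)
```

This sketch only follows links forward in time from the selected segment.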

Useful commands

For running experiments on a separate system

Copy experiment directories from remote to local

scp -P $PORT -r $USER@server:/home/$USER/ros2-message-flow-analysis/examples/trace-example-* .
