ros2_tracing

Tracing tools for ROS 2.

Primary language: Python. License: Apache-2.0.

Overview

ros2_tracing provides tracing instrumentation for the core ROS 2 packages. It also provides tools to configure tracing through a launch action and a ros2 CLI command.

ros2_tracing currently only supports the LTTng tracer. Consequently, it currently only supports Linux.

Note: make sure to use the right branch, depending on the ROS 2 distro: use rolling for Rolling, galactic for Galactic, etc.

Publications & presentations

Read the ros2_tracing paper! If you use or refer to ros2_tracing, please cite:

  • C. Bédard, I. Lütkebohle, and M. Dagenais, "ros2_tracing: Multipurpose Low-Overhead Framework for Real-Time Tracing of ROS 2," IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 6511–6518, 2022.

    BibTeX
    @article{bedard2022ros2tracing,
      title={ros2\_tracing: Multipurpose Low-Overhead Framework for Real-Time Tracing of ROS 2},
      author={B{\'e}dard, Christophe and L{\"u}tkebohle, Ingo and Dagenais, Michel},
      journal={IEEE Robotics and Automation Letters},
      year={2022},
      volume={7},
      number={3},
      pages={6511--6518},
      doi={10.1109/LRA.2022.3174346}
    }

This other paper leverages ros2_tracing to analyze and visualize the flow of messages across distributed ROS 2 systems:

  • C. Bédard, P.-Y. Lajoie, G. Beltrame, and M. Dagenais, "Message Flow Analysis with Complex Causal Links for Distributed ROS 2 Systems," Robotics and Autonomous Systems, vol. 161, p. 104361, 2023.

    BibTeX
    @article{bedard2023messageflow,
      title={Message flow analysis with complex causal links for distributed {ROS} 2 systems},
      author={B{\'e}dard, Christophe and Lajoie, Pierre-Yves and Beltrame, Giovanni and Dagenais, Michel},
      journal={Robotics and Autonomous Systems},
      year={2023},
      volume={161},
      pages={104361},
      doi={10.1016/j.robot.2022.104361}
    }

Finally, check out the following presentations:

  • ROSCon 2023: "Improving Your Application's Algorithms and Optimizing Performance Using Trace Data" (video, slides)
  • ROS World 2021: "Tracing ROS 2 with ros2_tracing" (video, slides)

Tutorials & demos

Building

Starting from ROS 2 Iron Irwini, the LTTng tracer is a ROS 2 dependency. Therefore, ROS 2 can be traced out-of-the-box on Linux; this package does not need to be re-built. The following rmw implementations are supported:

  • rmw_connextdds
  • rmw_cyclonedds_cpp
  • rmw_fastrtps_cpp
  • rmw_fastrtps_dynamic_cpp

To make sure that the instrumentation and tracepoints are available:

$ source /opt/ros/rolling/setup.bash  # With a binary install
$ source ./install/setup.bash  # When building from source
$ ros2 run tracetools status
Tracing enabled
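
For scripts that need to verify this before tracing, the same check can be wrapped in Python. This is a minimal sketch, not part of ros2_tracing: the function name `tracing_enabled` and the injectable `run` parameter are hypothetical, and it only assumes the "Tracing enabled" output shown above. The ROS 2 environment must already be sourced.

```python
import subprocess


def tracing_enabled(run=subprocess.run):
    """Return True if `ros2 run tracetools status` prints "Tracing enabled".

    `run` is a hypothetical injection point (defaults to subprocess.run) so
    that the check can be exercised without a ROS 2 installation.
    """
    try:
        result = run(
            ["ros2", "run", "tracetools", "status"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # `ros2` is not on PATH, e.g. the environment was not sourced.
        return False
    return "Tracing enabled" in result.stdout
```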

A ROS 2 installation only includes the LTTng userspace tracer (LTTng-UST), which is all that is needed to trace ROS 2. To trace the Linux kernel, the LTTng kernel tracer must be installed separately:

$ sudo apt-get update
$ sudo apt-get install lttng-modules-dkms

For more information about LTTng, refer to its documentation.

Removing the instrumentation

To build without any instrumentation, set TRACETOOLS_DISABLED:

$ colcon build --cmake-args -DTRACETOOLS_DISABLED=ON

This will remove all instrumentation from the core ROS 2 packages, and thus they will not depend on or link against the shared library provided by the tracetools package. This also means that LTTng is not required at build-time or at runtime.

Excluding tracepoints

Alternatively, to only exclude the actual tracepoints, use TRACETOOLS_TRACEPOINTS_EXCLUDED:

$ colcon build --packages-select tracetools --cmake-clean-cache --cmake-args -DTRACETOOLS_TRACEPOINTS_EXCLUDED=ON

This will keep the instrumentation but remove all tracepoints. This also means that LTTng is not required at build-time or at runtime. This option can be useful, since tracepoints can be added back in or removed by simply replacing/re-building the shared library provided by the tracetools package.

Tracing

By default, trace data will not be generated, and thus these packages will have virtually no impact on execution. LTTng has to be configured for tracing. The packages in this repo provide two options: a command and a launch file action.

Note: tracing must be started before the application is launched. Metadata is recorded during the initialization phase of the application. This metadata is needed to understand the rest of the trace data, so if tracing is started after the application started executing, then the trace data might be unusable. For more information, refer to the design document. The launch file action is designed to automatically start tracing before the application launches.

The tracing directory can be configured using command/launch action parameters, or through environment variables with the following logic:

  • Use $ROS_TRACE_DIR if ROS_TRACE_DIR is set and not empty.
  • Otherwise, use $ROS_HOME/tracing, using ~/.ros for ROS_HOME if not set or if empty.
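
The environment variable logic above can be sketched in Python as follows. This is an illustration only, not the implementation used by the tools: the function name `resolve_trace_directory` and its `env` parameter are hypothetical, and the sketch ignores the command/launch action parameters.

```python
import os
from pathlib import Path


def resolve_trace_directory(env=None):
    """Resolve the tracing directory from environment variables.

    `env` is a hypothetical parameter (defaults to os.environ) that makes the
    logic easy to try with different settings.
    """
    env = os.environ if env is None else env

    # 1. Use $ROS_TRACE_DIR if it is set and not empty.
    ros_trace_dir = env.get("ROS_TRACE_DIR")
    if ros_trace_dir:
        return Path(ros_trace_dir)

    # 2. Otherwise, use $ROS_HOME/tracing, with ~/.ros standing in for
    #    ROS_HOME if it is not set or is empty.
    ros_home = env.get("ROS_HOME")
    if not ros_home:
        return Path.home() / ".ros" / "tracing"
    return Path(ros_home) / "tracing"
```

For example, `resolve_trace_directory({"ROS_HOME": "/data/ros"})` yields `/data/ros/tracing`, while an empty mapping falls back to `~/.ros/tracing`.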

Additionally, if you're using kernel tracing with a non-root user, make sure that the tracing group exists and that your user is added to it.

# Create group if it doesn't exist
$ sudo groupadd -r tracing
# Add user to the group
$ sudo usermod -aG tracing $USER
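
To check the result from a script, the group database can be queried with Python's standard `grp` module. This is a minimal sketch; the helper name `user_in_group` is hypothetical. It reads the group database rather than the current login session, so a newly added membership may only take effect after logging out and back in.

```python
import grp


def user_in_group(user, group_name="tracing"):
    """Return True if `user` is listed as a member of `group_name`.

    Returns False if the group does not exist. Users who only have the group
    as their primary group are not listed in `gr_mem`, so this sketch does
    not detect that case.
    """
    try:
        members = grp.getgrnam(group_name).gr_mem
    except KeyError:
        # The group does not exist yet; create it with groupadd first.
        return False
    return user in members
```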

Trace command

The first option is to use the ros2 trace command.

$ ros2 trace

By default, it will enable all ROS 2 tracepoints. The trace will be written to ~/.ros/tracing/session-YYYYMMDDHHMMSS. Run the command with -h for more information.
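
As an illustration of the default session naming, the sketch below builds a name in the session-YYYYMMDDHHMMSS format. The function name `default_session_name` is hypothetical, and the use of local time is an assumption; the tools may derive the timestamp differently.

```python
from datetime import datetime


def default_session_name(now=None):
    """Build a session name in the session-YYYYMMDDHHMMSS format.

    `now` is a hypothetical parameter; it defaults to the current local time,
    which is an assumption of this sketch.
    """
    now = datetime.now() if now is None else now
    return "session-" + now.strftime("%Y%m%d%H%M%S")
```

For example, `default_session_name(datetime(2024, 1, 2, 3, 4, 5))` returns `session-20240102030405`.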

The ros2 trace command requires user interaction to start and then stop tracing. To trace without user interaction (e.g., in scripts), or for finer-grained tracing control, the following sub-commands can be used:

$ ros2 trace start session_name   # Configure tracing session and start tracing
$ ros2 trace pause session_name   # Pause tracing after starting
$ ros2 trace resume session_name  # Resume tracing after pausing
$ ros2 trace stop session_name    # Stop tracing after starting or resuming

Run each command with -h for more information.
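
In a Python script, the start and stop sub-commands could be paired with a context manager so that tracing is stopped even if the traced workload fails. This is a sketch under assumptions: the name `traced` and the injectable `run` parameter are hypothetical, and only the two sub-commands shown above are used.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def traced(session_name, run=subprocess.run):
    """Start tracing on entry and stop it on exit, even on error.

    `run` is a hypothetical injection point (defaults to subprocess.run).
    """
    # Tracing must be started before the application is launched.
    run(["ros2", "trace", "start", session_name], check=True)
    try:
        yield
    finally:
        run(["ros2", "trace", "stop", session_name], check=True)
```

The application would then be launched inside the block, e.g. `with traced("session_name"): ...`, which keeps the "start tracing first" ordering described above.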

You must install the kernel tracer if you want to enable kernel events (using the -k/--kernel-events option) or syscalls (using the --syscalls option). If you have installed the kernel tracer, are using kernel tracing, and still encounter an error here, make sure to add your user to the tracing group.

Launch file trace action

Another option is to use the Trace action in a Python, XML, or YAML launch file along with your Node action(s). This way, tracing automatically starts when launching the launch file and ends when it exits or when terminated.

$ ros2 launch tracetools_launch example.launch.py

The Trace action will also set the LD_PRELOAD environment variable to preload LTTng's userspace tracing helper(s) if the corresponding event(s) are enabled. For more information, see this example launch file and the Trace action.

You must install the kernel tracer if you want to enable kernel events (events_kernel in Python, events-kernel in XML or YAML) or syscalls (syscalls in Python, XML, or YAML). If you have installed the kernel tracer, are using kernel tracing, and still encounter an error here, make sure to add your user to the tracing group.
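
A Python launch file using the Trace action might look like the sketch below. It is a minimal illustration rather than the repository's example launch file: the session name, package name, and executable name are hypothetical placeholders, and it needs a sourced ROS 2 environment to be loaded by ros2 launch.

```python
from launch import LaunchDescription
from launch_ros.actions import Node
from tracetools_launch.action import Trace


def generate_launch_description():
    return LaunchDescription([
        # Listing the Trace action alongside the Node action(s) lets tracing
        # start when the launch file is launched and end when it exits.
        Trace(
            session_name='my-tracing-session',  # hypothetical session name
        ),
        Node(
            package='my_package',    # hypothetical package name
            executable='my_node',    # hypothetical executable name
            output='screen',
        ),
    ])
```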

Design

See the design document.

Real-time

LTTng-UST, the current default userspace tracer used for tracing ROS 2, was designed for real-time production applications. It is a low-overhead tracer with many important real-time compatible features:

  • userspace tracer completely implemented in userspace, independent from the kernel
  • reentrant, thread-safe, signal-safe, non-blocking
  • no system calls in the fast path
  • no copies of the trace data

However, some settings need to be tuned for it to be fully real-time safe and for performance to be optimal for your use-case:

  • timers [1]: use read timer to avoid a write(2) call
  • sub-buffer [1] count and size:
    • see documentation for sub-buffer count and size tuning tips based on your use-case
    • minimize sub-buffer count to minimize sub-buffer switching overhead
  • one-time memory allocation/lock/syscall per thread:
    • usually done the first time a tracepoint is executed within a thread for URCU thread registration, but registration can be manually performed to force it to be done during your application's initialization
    • see this LTTng mailing list message

For further reading:

The LTTng kernel tracer has a similar implementation, but is separate from the userspace tracer.

Packages

lttngpy

Package containing liblttng-ctl Python bindings.

ros2trace

Package containing a ros2cli extension to enable tracing.

tracetools

Library to support instrumenting ROS packages, including core packages.

This package claims to be in the Quality Level 1 category; see the Quality Declaration for more details.

See the API documentation.

tracetools_launch

Package containing tools to enable tracing through launch files.

tracetools_read

Package containing tools to read traces.

tracetools_test

Package containing tools for tracing-related tests.

tracetools_trace

Package containing tools to enable tracing.

test_ros2trace

Package containing system tests for ros2trace.

test_tracetools

Package containing unit and system tests for tracetools.

test_tracetools_launch

Package containing system tests for tracetools_launch.

Analysis

See tracetools_analysis.

Footnotes

  1. This setting cannot currently be set through the Trace launch file action or the ros2 trace command; see #20.