`ci_agent` in `build.ros2.org` has unexpected running nodes in background and causes test regressions
Crola1702 commented
Migrated from https://github.com/osrf/buildfarmer/issues/337 on Aug 22, 2022
Description
Three unexpected nodes running in the background on this agent are causing test regressions on Humble.
Reference builds:
- Humble Debug 25 test regressions
- Humble Debug 23 test regressions
- Humble FastRTPS 11 test regressions
Failing tests:
Example test
Test: rcl.TestGetNodeNames__rmw_fastrtps_cpp.test_rcl_get_node_names (from TestGetNodeNames__rmw_fastrtps_cpp)
Stacktrace:
/tmp/ws/src/ros2/rcl/rcl/test/rcl/test_get_node_names.cpp:138
Expected equality of these values:
discovered_nodes
Which is: { ("demo_node_0", "/"), ("demo_node_1", "/"), ("demo_node_2", "/"), ("node1", "/"), ("node1", "/"), ("node2", "/"), ("node2", "/ns/ns"), ("node3", "/ns") }
expected_nodes
Which is: { ("node1", "/"), ("node1", "/"), ("node2", "/"), ("node2", "/ns/ns"), ("node3", "/ns") }
There are 3 unexpected nodes in this test: `demo_node_0`, `demo_node_1`, and `demo_node_2`.
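To make the diff between the two lists explicit, here is a minimal sketch that compares them as multisets (the test output contains `("node1", "/")` twice, so a plain set would hide duplicates). The function name `unexpected_nodes` is made up for illustration; the node lists are copied from the failure above.

```python
from collections import Counter

# Node (name, namespace) pairs copied from the failing test output.
discovered_nodes = [
    ("demo_node_0", "/"), ("demo_node_1", "/"), ("demo_node_2", "/"),
    ("node1", "/"), ("node1", "/"), ("node2", "/"),
    ("node2", "/ns/ns"), ("node3", "/ns"),
]
expected_nodes = [
    ("node1", "/"), ("node1", "/"), ("node2", "/"),
    ("node2", "/ns/ns"), ("node3", "/ns"),
]


def unexpected_nodes(discovered, expected):
    """Return entries present in `discovered` beyond what `expected` accounts for."""
    # Counter subtraction keeps only positive counts, so duplicates are respected.
    extra = Counter(discovered) - Counter(expected)
    return sorted(extra.elements())


unexpected = unexpected_nodes(discovered_nodes, expected_nodes)
for name, namespace in unexpected:
    print(f"unexpected node: {name} (namespace {namespace})")
```

This only reports the three `demo_node_*` entries, which matches what the gtest failure shows.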
Explanation
- We tracked the errors through the log and found that tests were failing because of 3 nodes that appeared to have been running since the start of the build.
- We tried to replicate this error in `ci.ros2.org`:
  - As we couldn't replicate the error in `ci.ros2.org`, we think this agent had issues destroying those 3 nodes.
  - We think this error may be related to processes not being closed after a failure (even with Docker).
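As an illustration of the "processes not being closed after a failure" hypothesis, here is a rough sketch of one way a test runner could make sure child processes do not outlive it: start the command in its own session and signal the whole process group on exit. This is not how `ros_buildfarm`, `colcon`, or the launch tests do it; `run_and_reap` is a hypothetical helper, and the approach is POSIX-only. A grandchild that starts its own session would still escape.

```python
import os
import signal
import subprocess
import sys


def run_and_reap(cmd, timeout):
    """Run `cmd` in a new session and kill its whole process group afterwards.

    Returns the exit code, or None if the command did not finish in time.
    """
    # start_new_session=True makes the child a session and group leader,
    # so its process group ID equals its PID.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    finally:
        try:
            # Signal every process still in the group, including children
            # the command spawned and left behind.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # Nothing left in the group.
        proc.wait()


# A command that finishes normally, and one that would otherwise linger.
ok_result = run_and_reap([sys.executable, "-c", "pass"], timeout=10)
hung_result = run_and_reap(
    [sys.executable, "-c", "import time; time.sleep(60)"], timeout=0.5
)
```

In the second call the sleeping process is killed once the timeout expires instead of being left running in the background.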
22/08 Update
- This Humble build ran on a different agent and passed
Crola1702 commented
Commented by cottsay on Aug 22, 2022
From the troubled node (ci_agent-ffcf5120), this is the container that's still running:
f9566d9966832ea21871dff4f7a10745ad058d74c5c9545eb79632e6e5fee119 1660307708.398302425.ci_build_and_test.rolling "sh -c 'PATH=/usr/lib/ccache:$PATH PYTHONPATH=/tmp/ros_buildfarm:$PYTHONPATH python3 -u /tmp/ros_buildfarm/scripts/devel/build_and_test.py --rosdistro-name rolling --ros-version 2 --build-tool colcon --workspace-root /tmp/ws --parent-result-space --build-tool-args --cmake-args -DCMAKE_BUILD_TYPE=Release -DSKIP_MULTI_RMW_TESTS=1 --no-warn-unused-cli --build-tool-test-args --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m \"not xfail\"'"
jenkins+ 2883215 0.0 0.0 2888 96 ? S Aug12 0:00 /bin/sh -c PYTHONIOENCODING=utf_8 PYTHONUNBUFFERED=1 colcon test --build-base build_isolated --install-base install_isolated --test-result-base test_results --event-handlers console_direct+ --executor sequential --test-result-base /tmp/ws/test_results --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
jenkins+ 2883216 0.0 0.7 224048 57720 ? Sl Aug12 13:43 /usr/bin/python3 /usr/bin/colcon test --build-base build_isolated --install-base install_isolated --test-result-base test_results --event-handlers console_direct+ --executor sequential --test-result-base /tmp/ws/test_results --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m not xfail
jenkins+ 2994922 0.1 0.6 922448 50136 ? Sl Aug12 27:04 /usr/bin/python3 -m pytest
jenkins+ 2994957 0.1 0.8 709136 66904 ? Sl Aug12 19:16 /tmp/ws/install_isolated/demo_nodes_cpp/lib/demo_nodes_cpp/talker --ros-args -r __node:=demo_node_0
jenkins+ 2994959 0.0 0.8 708968 66648 ? Sl Aug12 11:07 /tmp/ws/install_isolated/demo_nodes_cpp/lib/demo_nodes_cpp/talker --ros-args -r __node:=demo_node_1
jenkins+ 2994961 0.0 0.8 709148 66272 ? Sl Aug12 11:14 /tmp/ws/install_isolated/demo_nodes_cpp/lib/demo_nodes_cpp/talker --ros-args -r __node:=demo_node_2
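For checking other agents, here is a small sketch that scans `ps aux`-style output for processes started with a `__node:=` remap, like the three `talker` processes above. The helper name `find_remapped_nodes` is made up, the column layout is assumed to match the listing in this comment, and the sample lines are copied from it.

```python
import re

# Matches the node-name remap argument, e.g. "__node:=demo_node_0".
NODE_REMAP = re.compile(r"__node:=(\S+)")


def find_remapped_nodes(ps_lines):
    """Return (pid, node_name) for each line whose command remaps __node."""
    found = []
    for line in ps_lines:
        # `ps aux` prints 10 columns before the command; keep the command whole.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # Skip the header and malformed lines.
        match = NODE_REMAP.search(fields[10])
        if match:
            found.append((int(fields[1]), match.group(1)))
    return found


sample = [
    "jenkins+ 2994922 0.1 0.6 922448 50136 ? Sl Aug12 27:04 /usr/bin/python3 -m pytest",
    "jenkins+ 2994957 0.1 0.8 709136 66904 ? Sl Aug12 19:16 /tmp/ws/install_isolated/demo_nodes_cpp/lib/demo_nodes_cpp/talker --ros-args -r __node:=demo_node_0",
    "jenkins+ 2994959 0.0 0.8 708968 66648 ? Sl Aug12 11:07 /tmp/ws/install_isolated/demo_nodes_cpp/lib/demo_nodes_cpp/talker --ros-args -r __node:=demo_node_1",
    "jenkins+ 2994961 0.0 0.8 709148 66272 ? Sl Aug12 11:14 /tmp/ws/install_isolated/demo_nodes_cpp/lib/demo_nodes_cpp/talker --ros-args -r __node:=demo_node_2",
]

leftovers = find_remapped_nodes(sample)
for pid, node in leftovers:
    print(f"pid {pid} is still running node {node}")
```

On a live agent the input lines could come from the output of `ps aux` instead of the hard-coded sample.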