:farmer: test_rosbag2_play_end_to_end flaky on Windows
Crola1702 opened this issue · 10 comments
Description
The test test_rosbag2_play_end_to_end is flaky in Windows CI (debug, release, and repeated).
Test regressions:
- projectroot.test_rosbag2_play_end_to_end
- rosbag2_tests.test_rosbag2_play_end_to_end.gtest.missing_result
Expected Behavior
Test should pass
Actual Behavior
Test failing because of a timeout
To Reproduce
- Run a build in nightly_win_deb, nightly_win_rel or nightly_win_rep
- See projectroot.test_rosbag2_play_end_to_end fail
System (please complete the following information)
- OS: Windows
- ROS 2 Distro: Rolling
- Install Method: Source
- Version: Rolling latest
Additional context
Reference build: https://ci.ros2.org/view/nightly/job/nightly_win_deb/2813/
Test is failing because of a timeout:
Log output:
[WARN] [1691491268.193064100] [subscription_manager_1691491266065392000]: SubscriptionManager::continue_spinning(..) finished by timeout
C:\ci\ws\src\ros2\rosbag2\rosbag2_tests\test\rosbag2_tests\test_rosbag2_play_end_to_end.cpp(80): error: Expected equality of these values:
future.wait_for(service_call_timeout_)
Which is: 4-byte object <01-00 00-00>
std::future_status::ready
Which is: 4-byte object <00-00 00-00>
1/11 Test #2: test_rosbag2_play_end_to_end .....***Timeout 60.00 sec
The test gets stuck when this error occurs and runs until the 60-second timeout (it normally takes 15 seconds to run).
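For readers decoding the gtest output above: the `4-byte object <01-00 00-00>` is the raw value of the `std::future_status` returned by `future.wait_for(...)`. The standard declares the enum as `ready, timeout, deferred`, so `01` corresponds to `timeout` and `00` to `ready`. Below is a minimal sketch that reproduces the same status with a promise that is never fulfilled. The helper names `wait_on_unanswered_call` and `status_as_int` are hypothetical and do not come from rosbag2.

```cpp
#include <chrono>
#include <future>

// Hypothetical stand-in for a service call whose response never arrives.
std::future_status wait_on_unanswered_call(std::chrono::milliseconds timeout)
{
  std::promise<void> response;                       // never fulfilled
  std::future<void> future = response.get_future();
  return future.wait_for(timeout);                   // expires without a value
}

// Underlying integer value of the status, comparable to the first byte
// that gtest prints for the enum object.
int status_as_int(std::future_status status)
{
  return static_cast<int>(status);
}
```

With these helpers, `status_as_int(wait_on_unanswered_call(std::chrono::milliseconds(10)))` evaluates to 1 (`timeout`), while `status_as_int(std::future_status::ready)` is 0, which matches the two values in the failed assertion.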
Flakiness ratio (last 15 days)
| job_name | last_time | first_time | build_count | failure_count | failure_percentage |
|---|---|---|---|---|---|
| nightly_win_rep | 2023-08-16 | 2023-08-02 | 15 | 13 | 86.67 |
| nightly_win_rel | 2023-08-16 | 2023-08-02 | 15 | 14 | 93.33 |
| nightly_win_deb | 2023-08-16 | 2023-08-02 | 15 | 13 | 86.67 |
Updated 17-08-2023
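The `failure_percentage` column appears to be `failure_count / build_count` expressed as a percentage and rounded to two decimals. A small sketch of that arithmetic follows. The function name is hypothetical, and this is an assumption about how the column was derived, not the actual buildfarm query.

```cpp
#include <cmath>

// Failure percentage rounded to two decimal places.
// Returns 0.0 when there are no builds, to avoid dividing by zero.
double failure_percentage(int failure_count, int build_count)
{
  if (build_count <= 0) {
    return 0.0;
  }
  return std::round(10000.0 * failure_count / build_count) / 100.0;
}
```

For the rows above, `failure_percentage(13, 15)` gives 86.67 and `failure_percentage(14, 15)` gives 93.33.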
Running a diff between ros2 repos in nightly_win_deb#2806 and nightly_win_deb#2807:
@@ -36,7 +36,7 @@ repositories:
eProsima\Fast-DDS:
type: git
url: https://github.com/eProsima/Fast-DDS.git
- version: 7cf43a62cabc3124721258f02c9257f451dd1971
+ version: 9ae27f174c1a33bf539c589ce2f1d6630052d1b0
eProsima\foonathan_memory_vendor:
type: git
url: https://github.com/eProsima/foonathan_memory_vendor.git
@@ -208,7 +208,7 @@ repositories:
ros2\mimick_vendor:
type: git
url: https://github.com/ros2/mimick_vendor.git
- version: 6fcd465251c1e62b1dddabf6607712da8a141a2c
+ version: 24f0be689e525dbcf18cce910ac12bae26c07e3c
ros2\orocos_kdl_vendor:
type: git
url: https://github.com/ros2/orocos_kdl_vendor.git
@@ -244,7 +244,7 @@ repositories:
ros2\rclpy:
type: git
url: https://github.com/ros2/rclpy.git
- version: 913afa019b0d60e7255bd84247dbeab4735d96ea
+ version: 5367703c9812680a563fb904b8c9d187da310bd3
ros2\rcpputils:
type: git
url: https://github.com/ros2/rcpputils.git
@@ -264,7 +264,7 @@ repositories:
ros2\rmw_connextdds:
type: git
url: https://github.com/ros2/rmw_connextdds.git
- version: 1206113b6eff6d2a6557911866548ff4415c6852
+ version: b57d032cad10a68f0cb0d349a715b9e3c4475cf4
ros2\rmw_cyclonedds:
type: git
url: https://github.com/ros2/rmw_cyclonedds.git
@@ -300,7 +300,7 @@ repositories:
ros2\rosbag2:
type: git
url: https://github.com/ros2/rosbag2.git
- version: ba199d05954d6e51975c47acd3725cac6267778f
+ version: fda5aea264b4183d94352a7c7caf266ae104aa52
First time this issue was seen:
@clalancette, this is happening almost consistently in Windows builds. Do you think it's a good time to disable this test until #1342 is resolved? (It seems it will be a long time before we get updates on the fix PR.)
@Crola1702 @clalancette I guess it hasn't been fully fixed for Windows after my overhaul of the player tests in #1297.
I will try to take a look at it this week.
If I don't find a solution, we can disable it for Windows again.
BTW, I came across this failure in one of my recent PRs: #1423 (comment).
My preliminary analysis was:
PlayEndToEndTestFixture.play_filters_by_topic failed by timeout because it did not receive confirmation of the service call for player resume within 60 seconds.
> BTW, I came across this failure in one of my recent PRs: #1423 (comment).
I haven't seen this failure in the buildfarm 🤔