integration test for stopping target service of proxy feature broken

Question

Closed this issue 6 months ago · 2 comments

Describe the bug

It seems that the proxy-service-stop-service feature is broken in the GH CI:
https://github.com/eclipse-bluechi/bluechi/actions/runs/8138365284/job/22240289714?pr=770#step:8:175

Running the tests locally in the container setup doesn't lead to a failure.

The test times out while waiting for requesting.service to start (which resolves the proxy dependency).

Note:
It's currently a bit hard to debug since no journal logs or other artifacts are collected when the pytest timeout fires (it can't be caught, and apparently isn't intended to be). Therefore, a small and simple custom implementation of a signal-based timeout might be better in the future.
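
To illustrate the signal-based timeout suggested in the note above, here is a minimal sketch. It is not BlueChi's actual implementation; the names `StepTimeout`, `timeout` and `run_with_timeout` are hypothetical. It relies on `SIGALRM`, so it only works on Unix and only in the main thread. Because the timeout surfaces as an ordinary exception, the caller can catch it and still gather logs before failing.

```python
import signal
from contextlib import contextmanager


class StepTimeout(Exception):
    """Hypothetical exception raised when the guarded block runs too long."""


@contextmanager
def timeout(seconds):
    """Raise StepTimeout in the main thread if the body exceeds `seconds`."""

    def _on_alarm(signum, frame):
        raise StepTimeout(f"timed out after {seconds}s")

    # Install our handler and remember the previous one so it can be restored.
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Always disarm the timer and restore the old handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


def run_with_timeout(func, seconds, on_timeout):
    """Run func(); on timeout, call on_timeout() (e.g. collect logs), then re-raise."""
    try:
        with timeout(seconds):
            return func()
    except StepTimeout:
        on_timeout()
        raise
```

A test runner could pass its log-gathering routine as `on_timeout`, so that journal logs and other artifacts are saved even when the test body hangs.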

To Reproduce

Run the integration tests in the CI.

Expected behavior

Test passes

Answer 1 · 2024-03-04T11:33:52.000Z

It seems there are logs about too many open files:

11:18:34                 out: 2024-03-04 11:18:34+0000,097 DEBUG   [bluechi_test.test] Stopping all BlueChi components in all container... (test:99)
11:18:34                 out: 2024-03-04 11:18:34+0000,148 DEBUG   [bluechi_test.client] Executed command 'systemctl stop bluechi-agent' with result '0' and output 'b''' (client:84)
11:18:34                 out: 2024-03-04 11:18:34+0000,222 DEBUG   [bluechi_test.client] Executed command 'systemctl show --property="Result" bluechi-agent' with result '0' and output 'Result=success' (client:84)
11:18:34                 out: 2024-03-04 11:18:34+0000,293 DEBUG   [bluechi_test.client] Executed command 'systemctl stop bluechi-agent' with result '0' and output 'Failed to allocate directory watch: Too many open files' (client:84)
11:18:34                 out: 2024-03-04 11:18:34+0000,368 DEBUG   [bluechi_test.client] Executed command 'systemctl show --property="Result" bluechi-agent' with result '0' and output 'Result=success' (client:84)

So this might be related to podman and the GH hosts in the CI.

Update:
It seems that once this issue occurs, it persists for quite a while but eventually disappears after restarting the pipeline.
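
The "Failed to allocate directory watch: Too many open files" message from `systemctl` is often associated with exhausted inotify limits on the host. Whether that is the cause here is an assumption and is not confirmed in this thread. As a diagnostic aid, a small sketch like the following could print the current limits on the CI host; the helper name `read_inotify_limits` is hypothetical, and the paths are the standard Linux procfs locations.

```python
from pathlib import Path


def read_inotify_limits(base="/proc/sys/fs/inotify"):
    """Return the kernel's inotify limits as a dict; None where unreadable."""
    limits = {}
    for name in ("max_user_instances", "max_user_watches", "max_queued_events"):
        path = Path(base) / name
        try:
            limits[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # File missing (non-Linux host) or unexpected content.
            limits[name] = None
    return limits


if __name__ == "__main__":
    for name, value in read_inotify_limits().items():
        print(f"{name}: {value}")
```

Comparing these values between a failing CI run and a local container setup might help narrow down whether the host configuration plays a role.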

Answer 2 · 2024-03-12T12:52:56.000Z

Should be fixed by #820. Feel free to reopen if it appears again.