Functional Test Runner, test/functional/test_runner.py, intermittently fails on random functional tests.

Question

Closed this issue 2 years ago · 4 comments

Describe the issue

When executing the functional test suite via test/functional/test_runner.py --extended, some tests fail intermittently. When run in isolation, however, every test passes except test/functional/zmq_test.py, which is tracked in a separate issue.

The number of threads the test runner is configured to run (any value >= 1) does not matter; every configuration results in random functional test failures.

What behavior did you expect?

The expected behavior is that regardless of the number of threads the test runner is configured with, all functional tests will pass.

What was the actual behavior (provide screenshots if the issue is GUI-related)?

The actual behavior is that some functional tests will fail intermittently, regardless of the number of threads the test runner is configured with. See here.

How reliably can you reproduce the issue, what are the steps to do so?

This issue can be reproduced on a local machine and via GitHub Actions.

What version of DigiByte Core are you using, where did you get it (website, self-compiled, etc)?

v8.22

What type of machine are you observing the error on (OS/CPU and disk type)?

All machine types can reproduce this issue. macOS 10.15, 11 and 12 were used locally, and GitHub Actions with Ubuntu latest was also used.

Answer 1 · 2022-07-10T20:59:10.000Z

It appears this can be fixed by adding a simple "--timeout-factor=" argument when running test_runner.py.
Ex:

test/functional/test_runner.py --jobs=10 --timeout-factor=1.4

It seems the BTC devs ran into similar random failures in their functional tests; see bitcoin/bitcoin#20183.

In readme:
https://github.com/DigiByte-Core/digibyte/blob/feature/8.22.0/test/README.md

Often while debugging rpc calls from functional tests, the test might reach timeout before process can return a response. Use --timeout-factor 0 to disable all rpc timeouts for that particular functional test. Ex: test/functional/wallet_hd.py --timeout-factor 0.

In code:
https://github.com/DigiByte-Core/digibyte/blob/feature/8.22.0/test/functional/test_framework/test_framework.py#L194

        parser.add_argument('--timeout-factor', dest="timeout_factor", type=float, default=1.0, help='adjust test timeouts by a factor. Setting it to 0 disables all timeouts')
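To make the effect of the flag concrete, here is a minimal sketch of how a timeout factor could be applied to a base timeout. It is not the framework's actual code: `build_parser` and `scaled_timeout` are hypothetical helpers written for illustration, and returning `None` for "no timeout" is an assumption about one reasonable way to honor the "0 disables all timeouts" rule from the help text.

```python
import argparse


def build_parser():
    """Parser with the same --timeout-factor option quoted above."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--timeout-factor', dest="timeout_factor", type=float, default=1.0,
                        help='adjust test timeouts by a factor. Setting it to 0 disables all timeouts')
    return parser


def scaled_timeout(base_seconds, timeout_factor):
    """Hypothetical helper: scale a base timeout by the factor.

    Assumption: a factor of 0 means "no timeout", represented here as None.
    """
    if timeout_factor == 0:
        return None
    return base_seconds * timeout_factor


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
options = build_parser().parse_args(['--timeout-factor', '1.4'])
rpc_timeout = scaled_timeout(60, options.timeout_factor)  # 60 is an illustrative base value
print(rpc_timeout)
```

With a factor above 1.0, every wait in a test gets proportionally more headroom, which is why a loaded machine running several jobs in parallel benefits from it.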

Answer 2 · 2022-07-11T18:59:59.000Z

Boom! All functional tests pass for me with --timeout-factor=3. I have run it twice now. I think we can close this; you will have to play with the values depending on your local dev environment. For me, running a Mac with 8 cores and doing other work at the same time, 3 jobs was the sweet spot.

test/functional/test_runner.py --jobs=3 --timeout-factor=3
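Since the right values depend on the machine, it can help to script the invocation so the two settings are easy to vary. The following is a rough sketch, assuming it is run from the repository root; `build_runner_command` and `run_suite` are hypothetical helper names, and only the `--jobs`, `--timeout-factor` and `--extended` flags mentioned in this thread are used.

```python
import subprocess
import sys

RUNNER = "test/functional/test_runner.py"  # path relative to the repository root


def build_runner_command(jobs, timeout_factor, extended=False):
    """Build the argument list for one test_runner.py invocation."""
    cmd = [sys.executable, RUNNER, f"--jobs={jobs}", f"--timeout-factor={timeout_factor}"]
    if extended:
        cmd.append("--extended")
    return cmd


def run_suite(jobs, timeout_factor, extended=False):
    """Run the suite once and return the runner's exit code (0 means success)."""
    return subprocess.run(build_runner_command(jobs, timeout_factor, extended)).returncode


# Example (not executed here): the settings reported to work above.
# exit_code = run_suite(jobs=3, timeout_factor=3)
```

Calling `run_suite` with a few different pairs of values is one way to find a combination that passes reliably on a given machine.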

Answer 3 · 2022-07-12T01:51:03.000Z

Great work on this, @JaredTate! Did you try these params on the GitHub Actions runners, or do we still need to open a PR for that?

Answer 4 · 2023-03-15T18:19:12.000Z

This issue has been resolved by #106