Tests do not pass (fails & timeout)
Closed this issue · 7 comments
Accordingly to CI, the tests do not run on regular basis (to have an idea whether that a local issue or a global one). Running tests locally leads to failed checks because of unmet conditions and timeouts (see output below). The same behavior is observed when the tests are run with enabled x64 atomic flag (go test -v -tags amd64_atomic ./...
).
Revision:
~/go-uring$ git rev-parse HEAD
79884120fd55186446a1f4215f31244c51b5cf62
OS:
~/go-uring$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 21.10
Release: 21.10
Codename: impish
Linux:
~/go-uring$ uname -a
Linux ubuntu.localdomain 5.13.0-22-generic #22-Ubuntu SMP Fri Nov 5 13:21:36 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Output:
~/go-uring$ go test -v ./...
? github.com/godzie44/go-uring/example/echo-server [no test files]
? github.com/godzie44/go-uring/example/echo-server-multi-thread [no test files]
? github.com/godzie44/go-uring/example/http-server [no test files]
=== RUN TestListenerAccept
--- PASS: TestListenerAccept (0.61s)
PASS
ok github.com/godzie44/go-uring/net (cached)
=== RUN TestNetworkReactor
=== RUN TestNetworkReactor/TestCancelOperation
net_test.go:90:
Error Trace: net_test.go:90
Error: An error is expected but got nil.
Test: TestNetworkReactor/TestCancelOperation
Messages: operation canceled
=== RUN TestNetworkReactor/TestExecuteWithDeadline
net_test.go:71:
Error Trace: net_test.go:71
Error: Should be true
Test: TestNetworkReactor/TestExecuteWithDeadline
--- FAIL: TestNetworkReactor (1.06s)
--- FAIL: TestNetworkReactor/TestCancelOperation (0.49s)
--- FAIL: TestNetworkReactor/TestExecuteWithDeadline (0.56s)
=== RUN TestRequestID
--- PASS: TestRequestID (0.00s)
=== RUN TestReactor
=== RUN TestReactor/TestCancelOperation
=== RUN TestReactor/TestExecuteWithDeadline
=== RUN TestReactor/TestReactorExecuteReadV
--- PASS: TestReactor (2.11s)
--- PASS: TestReactor/TestCancelOperation (1.04s)
--- PASS: TestReactor/TestExecuteWithDeadline (1.04s)
--- PASS: TestReactor/TestReactorExecuteReadV (0.04s)
=== RUN TestRegistry
--- PASS: TestRegistry (0.57s)
FAIL
FAIL github.com/godzie44/go-uring/reactor 3.739s
=== RUN TestProbe
--- PASS: TestProbe (0.00s)
=== RUN TestIOWQMaxWorkers
register_test.go:42: Skipped, IORING_REGISTER_IOWQ_MAX_WORKERS not supported
--- SKIP: TestIOWQMaxWorkers (0.02s)
=== RUN TestAccept
--- PASS: TestAccept (0.00s)
=== RUN TestAcceptWithSQPoll
--- PASS: TestAcceptWithSQPoll (0.00s)
=== RUN TestAcceptCancel
--- PASS: TestAcceptCancel (0.01s)
=== RUN TestAcceptMany
--- PASS: TestAcceptMany (0.06s)
=== RUN TestAcceptLink
--- PASS: TestAcceptLink (0.20s)
=== RUN TestAcceptAddr
panic: test timed out after 10m0s
goroutine 15 [running]:
testing.(*M).startAlarm.func1()
/usr/lib/go-1.17/src/testing/testing.go:1788 +0x8e
created by time.goFunc
/usr/lib/go-1.17/src/time/sleep.go:180 +0x31
goroutine 1 [chan receive]:
testing.(*T).Run(0xc0000856c0, {0x5d2bde, 0x46b013}, 0x5e3ab0)
/usr/lib/go-1.17/src/testing/testing.go:1307 +0x375
testing.runTests.func1(0xc000095290)
/usr/lib/go-1.17/src/testing/testing.go:1598 +0x6e
testing.tRunner(0xc0000856c0, 0xc00010bd18)
/usr/lib/go-1.17/src/testing/testing.go:1259 +0x102
testing.runTests(0xc000102080, {0x741ec0, 0x18, 0x18}, {0x47c30d, 0x5d2e00, 0x747140})
/usr/lib/go-1.17/src/testing/testing.go:1596 +0x43f
testing.(*M).Run(0xc000102080)
/usr/lib/go-1.17/src/testing/testing.go:1504 +0x51d
main.main()
_testmain.go:89 +0x14b
goroutine 13 [syscall]:
syscall.Syscall6(0x1aa, 0xf, 0x0, 0x1, 0x1, 0x0, 0x8)
/usr/lib/go-1.17/src/syscall/asm_linux_amd64.s:43 +0x5
github.com/godzie44/go-uring/uring.sysEnter2(0xc00019c220, 0x612280, 0x0, 0x1a0340, 0x0, 0x0, 0x0)
/home/vagrant/go-uring/uring/syscall.go:68 +0xa5
github.com/godzie44/go-uring/uring.(*Ring).getCQEvents(0xc00019c220, {0x0, 0x1, 0x0, 0x0, 0x8})
/home/vagrant/go-uring/uring/ring.go:316 +0x10b
github.com/godzie44/go-uring/uring.(*Ring).WaitCQEvents(...)
/home/vagrant/go-uring/uring/ring.go:380
github.com/godzie44/go-uring/uring.TestAcceptAddr(0xc0001a0340)
/home/vagrant/go-uring/uring/ring_accept_test.go:358 +0x2d2
testing.tRunner(0xc0001a0340, 0x5e3ab0)
/usr/lib/go-1.17/src/testing/testing.go:1259 +0x102
created by testing.(*T).Run
/usr/lib/go-1.17/src/testing/testing.go:1306 +0x35a
goroutine 14 [chan send]:
github.com/godzie44/go-uring/uring.dial(0xc0001a0340, 0x5e3ab0)
/home/vagrant/go-uring/uring/ring_accept_test.go:23 +0x95
created by github.com/godzie44/go-uring/uring.TestAcceptAddr
/home/vagrant/go-uring/uring/ring_accept_test.go:349 +0x1b4
FAIL github.com/godzie44/go-uring/uring 600.148s
FAIL
Thanks for participating
Accordingly to CI, the tests do not run on regular basis (to have an idea whether that a local issue or a global one)
It's not a global issue, I had some trouble with github CI on any io_uring syscall, so local tests must be good
About your test results:
TestAcceptAddr
This and any network test use hardcoded address localhost:8080, can you check that this port is not in use on test running? I think this may be a problem
TestCancelOperation and TestExecuteWithDeadline
This test uses 8080 too, please try to check these ports are not in use when tests start.
Also, can you try to run all tests sequentially with -p 1?
If my assumptions are confirmed, I will try to make the necessary edits to conveniently run tests excluding port conflicts ASAP.
Also please make sure that io_uring support IORING_FEAT_EXT_ARG feature, it totally require for reactors correct work (to be more precise for the correct work of timeouts and cancellations)
This and any network test use hardcoded address localhost:8080, can you check that this port is not in use on test running? I think this may be a problem
That is completely clean environment. So I double-checked ports open anyway:
vagrant@ubuntu:~/go-uring$ sudo netstat -putan
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 557/systemd-resolve
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 661/sshd: /usr/sbin
tcp 0 0 10.0.2.15:60248 140.82.121.4:443 TIME_WAIT -
tcp 0 44 10.0.2.15:22 10.0.2.2:55332 ESTABLISHED 6691/sshd: vagrant
tcp6 0 0 :::22 :::* LISTEN 661/sshd: /usr/sbin
udp 0 0 127.0.0.53:53 0.0.0.0:* 557/systemd-resolve
udp 0 0 10.0.2.15:68 0.0.0.0:* 555/systemd-network
vagrant@ubuntu:~/go-uring$ python3 -c 'import socket; socket.socket().connect(("localhost", 8080))'
Traceback (most recent call last):
File "<string>", line 1, in <module>
ConnectionRefusedError: [Errno 111] Connection refused
Also please make sure that io_uring support IORING_FEAT_EXT_ARG feature, it totally require for reactors correct work (to be more precise for the correct work of timeouts and cancellations)
As far as I understand, accordingly to the check, the test should fail explicitly with a corresponding error. Or some other hidden traps are possible?
Also, can you try to run all tests sequentially with -p 1?
And here a weird thing comes on the scene. With -p 1
all the tests pass, and when the tests are re-run again but concurrently, they do pass again and again (sic!). Once you delete go
's cache, an original issue could be observed. I guess that is kind of hidden trap :).
Thanks! This behavior looks like parallel test conflicts cause they try to use the same ports, I think need some fix of it (detect free port on every test?). Also, the original liburing test's run sequentially (and maybe I ported them too literally).
I was able to reproduce these issue in my local env, and unique ports solve this problem. Would be great if you check it on your environment now.