eclipse/paho.golang

An autopaho test TestQueuedMessage may be not passed.

Opened this issue · 1 comments

Describe the bug
I think this isn't a bug, but it could be reproduced in partical environments especially with low machine resources. So, as a attention, I reported this.

autopaho's main loop of connection manager is terminated in one second at TestQueuedMessage. The part of implemention is the following.

func TestQueuedMessages(t *testing.T) {
...
        // [tomatod's comment] longerDelay is one second
	ctx, cancel := context.WithTimeout(context.Background(), longerDelay)
	defer cancel()
	cm, err := NewConnection(ctx, config)
...
}

So, if your machine couldn't finish the test in one secode, the connection manager would be terminated and test would be failed in result.

To reproduce

  • Environment: Memory: 6GiB / CPU: Intel(R) Pentium(R) Silver N5000 CPU @ 1.10GHz (4 core)
  • Command: go test -coverprofile /tmp/autopaho_coverage.out -race -tags=unittest ./autopaho/ -v -count 1 -run TestQueuedMessages

(I often use this low spec machine outside...)

Debug output
Hire is an example result of the above test command.

...
    test.go:53: 2024-02-17T16:54:18.467392298+09:00 test: innerCtx Done
    test.go:53: 2024-02-17T16:54:18.467846545+09:00 test: disconnecting &{<nil> 0}
    test.go:53: 2024-02-17T16:54:18.468537774+09:00 test: client stopping, incoming stopping
    test.go:53: 2024-02-17T16:54:18.468715317+09:00 test: returning from incoming worker
    test.go:53: 2024-02-17T16:54:18.46899835+09:00 test: returning from publish packets loop worker
    test.go:53: 2024-02-17T16:54:18.469265094+09:00 test: error called: EOF
    test.go:53: 2024-02-17T16:54:18.469104492+09:00 testServer: handleIncoming closed with error handleIncoming:ReadPacket: read tcp 127.0.0.1:46611->127.0.0.1:37874: use of closed network connection Remaining data: []
    test.go:53: 2024-02-17T16:54:18.469471831+09:00 test: returning from ping handler worker
    test.go:63: 2024-02-17T16:54:18.469598315+09:00 testServer:error closing ourConn: close tcp 127.0.0.1:46611->127.0.0.1:37874: use of closed network connection
    test.go:53: 2024-02-17T16:54:18.470503258+09:00 testServer: disconnected
    test.go:53: 2024-02-17T16:54:18.469983564+09:00 test: client stop requested
    test.go:53: 2024-02-17T16:54:18.471257796+09:00 test: conn closed
    test.go:53: 2024-02-17T16:54:18.471494756+09:00 test: acks tracker reset
    test.go:53: 2024-02-17T16:54:18.472294606+09:00 test: session updated, waiting on workers
    test.go:53: 2024-02-17T16:54:18.473805828+09:00 test: workers done
    test.go:63: 2024-02-17T16:54:18.474075876+09:00 test:handleError received extra error: EOF
    test.go:63: 2024-02-17T16:54:18.474838395+09:00 test:mainLoop: disconnect returned error: write tcp 127.0.0.1:37874->127.0.0.1:46611: write: broken pipe
    test.go:63: 2024-02-17T16:54:18.475043068+09:00 test:mainLoop: server connection handler exiting due to context: context deadline exceeded
    test.go:53: 2024-02-17T16:54:18.475369928+09:00 test: mainLoop: connection manager has terminated
    queue_test.go:208: timeout awaiting messages
--- FAIL: TestQueuedMessages (5.55s)

Expected behaviour
Test is desired to pass without depending on environments.

Thanks for raising this,

I'd appreciate it if someone else could run their eyes over these tests; they may be a bit fragile (have seen a very occasional failure) but were the best I could come up with at the time. I wanted to really exercise autopaho well as errors are most likely to surface when it's under stress (and the tests did reveal a couple of issues; since fixed).

I think the actual issue here is the context ctx, cancel := context.WithTimeout(context.Background(), longerDelay); this probably needs to be longer than a second on low powered kit. Would appreciate any thoughts re a better way of handling this (there will always be a potential for false alerts with time based tests like this when resources are very low; currently my only real thought is to increase the timeouts).

M.