relab/hotstuff

eventloop: TestHandler is flaky; possible deadlock


$ go test -run TestHandler -count=20000
--- FAIL: TestHandler (1.00s)
    eventloop_test.go:31: timed out
--- FAIL: TestHandler (1.00s)
    eventloop_test.go:31: timed out
--- FAIL: TestHandler (1.00s)
    eventloop_test.go:31: timed out
--- FAIL: TestHandler (1.00s)
    eventloop_test.go:31: timed out
FAIL
exit status 1
FAIL	github.com/relab/hotstuff/eventloop	4.776s

Sometimes the test passes just fine (and finishes quickly), but if you run it many times you should be able to reproduce the failure. Increasing the context's timeout does not help, and non-failing executions finish fast, so I suspect this is a deadlock rather than a slow run.

I discovered this while upgrading the dependencies in PR #118, which itself doesn't touch the event loop. However, it also reproduces on master at commit 6c1fcb7b4b413.

Adding the following line:

	go el.Run(ctx)
+	time.Sleep(1 * time.Millisecond) // wait for the event loop to start

seems to resolve the problem:

$ go test -run TestHandler -count=200000
PASS
ok  	github.com/relab/hotstuff/eventloop	237.785s
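
The sleep only makes it very likely that the loop is running before the test continues; it doesn't guarantee it. A deterministic alternative would be for the loop to signal when it has started, so the test can wait on that signal instead of sleeping. Below is a minimal sketch of that pattern. Everything in it (`eventLoop`, `WaitStarted`, `AddEvent`, `handledWithinTimeout`) is a hypothetical stand-in written for illustration, not the actual hotstuff `eventloop` API, and whether the real failure is caused by an event arriving before `Run` starts is still an assumption on my part.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// eventLoop is a hypothetical, stripped-down stand-in for illustration only.
// It is NOT the hotstuff eventloop implementation.
type eventLoop struct {
	events    chan any
	handler   func(any)
	started   chan struct{} // closed once Run has begun executing
	startOnce sync.Once
}

func newEventLoop(handler func(any)) *eventLoop {
	return &eventLoop{
		events:  make(chan any, 16),
		handler: handler,
		started: make(chan struct{}),
	}
}

// Run signals that the loop has started, then dispatches events until ctx is done.
func (el *eventLoop) Run(ctx context.Context) {
	el.startOnce.Do(func() { close(el.started) })
	for {
		select {
		case ev := <-el.events:
			el.handler(ev)
		case <-ctx.Done():
			return
		}
	}
}

// WaitStarted blocks until Run has begun or ctx is done, replacing the sleep.
func (el *eventLoop) WaitStarted(ctx context.Context) error {
	select {
	case <-el.started:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// AddEvent queues an event for the loop.
func (el *eventLoop) AddEvent(ev any) { el.events <- ev }

// handledWithinTimeout mirrors the shape of the test: start the loop, wait
// until it is running, add one event, and report whether the handler ran
// before the context timed out.
func handledWithinTimeout() bool {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	handled := make(chan struct{})
	el := newEventLoop(func(any) { close(handled) })

	go el.Run(ctx)
	if err := el.WaitStarted(ctx); err != nil {
		return false
	}
	el.AddEvent("hello")

	select {
	case <-handled:
		return true
	case <-ctx.Done():
		return false
	}
}

func main() {
	fmt.Println("handled:", handledWithinTimeout())
}
```

If the sleep really is masking a startup race, a signal like this would make the test independent of scheduler timing. If the failure still reproduces with explicit synchronization, the cause is somewhere else in the loop.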