TestStartup is flaky
Closed this issue · 4 comments
During #259, failed in https://github.com/ssbc/go-ssb/actions/runs/3725019672/jobs/6317671995:
github.com/ssbc/go-ssb/tests
2022/12/18 13:41:52 [margaret/indexes/badger] overwrote batch limit 0
2022/12/18 13:41:52 saved identity @YloRx1QFtJiWa0etvN9iN3pVIS6rFuOe7MKTNiS/Mko=.ed25519 to testrun/TestBlobToJS/secret
TAP version 13TestBlobToJS0
sbot spawned, running before
dialing:net:127.0.0.1:40309~shs:YloRx1QFtJiWa0etvN9iN3pVIS6rFuOe7MKTNiS/Mko=
ok 1 connected
level=debug t=441.099408ms plugin=ebt event=replicating version=3
ok 2 got blob
ok 3 no errclosed sbot
1..3
tests 3
pass 3
ok
level=debug t=1.453971889s plugin=ebt r=<@6VgT.ed25519> event="loop exited"
JS Sbot process returned
level=debug t=1.454212993s event="network listen loop exited"
level=warn t=1.454244393s unit=gossip event="live qry on rxlog exited"
level=debug t=1.454279094s event="sbot closing" msg="connections closed"
level=debug t=1.454285894s event="sbot closing" msg="waited for indexes to close"
level=info t=1.460750401s event="sbot closing" msg="closers closed"
2022/12/18 13:41:54 saved identity @UfjRJPAtDhepXvxQxUjYse6dDwpiVbXhXAF1z47llQ0=.ed25519 to testrun/TestBlobFromJS/secret
TAP version 13TestBlobFromJS1
sbot spawned, running before
ok 1 added err
ok 2 blob iddialing:net:127.0.0.1:46159~shs:UfjRJPAtDhepXvxQxUjYse6dDwpiVbXhXAF1z47llQ0=
ok 3 connected
ok 4 should have blob
ok 5 has err
level=debug t=428.489599ms plugin=ebt event=replicating version=3
level=info t=515.921547ms module=WantManager event=blobs.get ref=<&w6uP.sha256> msg=stored ref=<&w6uP.sha256> sz=6closed sbot
1..5
tests 5
pass 5
ok
JS Sbot process returned
level=debug t=1.899074662s event="network listen loop exited"
level=warn t=1.899110563s unit=gossip event="live qry on rxlog exited"
level=debug t=1.899167364s plugin=ebt r=<@cw5U.ed25519> event="loop exited"
t=1.899239765s event=warning msg="still open connections" count=1
level=debug t=1.899321566s event="sbot closing" msg="connections closed"
level=debug t=1.899340767s event="sbot closing" msg="waited for indexes to close"
level=info t=1.909133029s event="sbot closing" msg="closers closed"
2022/12/18 13:41:55 saved identity @BarQLh9J29zZ/kzZWE9iM9s6/fyMiMlJRb3BBTsjHoA=.ed25519 to testrun/TestBlobWithHop/secret
TAP version 13TestBlobWithHop2
sbot spawned, running before
ok 1 null
got hash:&X5jnorJyi0ta5WEFx7DQU8hbef/RYA0qkCZ4ptn6FX8=.sha256
ok 2 null
size12
ok 3 null
leaked blob addr in:%fq9lnY1igTYW6CgP/mkVFiGSElIJCYrCd4N4Kb+ENqA=.sha256
dialing:net:127.0.0.1:40383~shs:BarQLh9J29zZ/kzZWE9iM9s6/fyMiMlJRb3BBTsjHoA=
ok 4 connected
level=debug t=935.191294ms plugin=ebt event=replicating version=3
TAP version 13TestBlobWithHop3
sbot spawned, running before
dialing:net:127.0.0.1:40383~shs:BarQLh9J29zZ/kzZWE9iM9s6/fyMiMlJRb3BBTsjHoA=
ok 1 connected
level=debug t=1.394880097s plugin=ebt event=replicating version=3
level=info t=1.483094955s module=WantManager event=blobs.get ref=<&X5jn.sha256> msg=stored ref=<&X5jn.sha256> sz=12
level=debug t=1.483313859s module=WantManager remote=<@fyCV.ed25519> event=blobStoreNotify op=put ref=<&X5jn.sha256> cause="broadcasting received blob" sz=12
ok 2 want err?closed sbot
1..2
tests 2
pass 2
ok
JS Sbot process returned
level=debug t=1.493595029s event="network listen loop exited"
level=warn t=1.493616829s unit=gossip event="live qry on rxlog exited"
level=debug t=1.493686431s plugin=ebt r=<@fyCV.ed25519> event="loop exited"
level=debug t=1.493806033s plugin=ebt r=<@n3Ym.ed25519> event="loop exited"
t=1.493856533s event=warning msg="still open connections" count=1
level=debug t=1.493882234s event="sbot closing" msg="connections closed"
level=debug t=1.493888434s event="sbot closing" msg="waited for indexes to close"closed sbot
1..4
tests 4
pass 4
ok
level=info t=1.499216822s event="sbot closing" msg="closers closed"
JS Sbot process returned
2022/12/18 13:41:57 saved identity @QZBgS5r8fCjK3Pg8P8CFkqNa1YcCl6Hub1O7Avgn0Vo=.ed25519 to testrun/TestBlobTooBigWantedByJS/secret
TAP version 13TestBlobTooBigWantedByJS4
sbot spawned, running before
dialing:net:127.0.0.1:35133~shs:QZBgS5r8fCjK3Pg8P8CFkqNa1YcCl6Hub1O7Avgn0Vo=
level=warn t=432.807356ms event="no stored feeds - attempting re-sync with trust-on-first-use"
ok 1 connected
level=debug t=481.102755ms plugin=ebt event=replicating version=3
ok 2 got small blob
ok 3 no err
no stream for incoming msg { req: -3, stream: true, end: true, value: true, length: 4, type: 2 }closed sbot
1..3
tests 3
pass 3
ok
JS Sbot process returned
level=debug t=6.876755303s event="network listen loop exited"
level=warn t=6.876812004s unit=gossip event="live qry on rxlog exited"
level=debug t=6.876824104s plugin=ebt r=<@PUfS.ed25519> event="loop exited"
level=debug t=6.876852205s event="sbot closing" msg="connections closed"
level=debug t=6.876966507s event="sbot closing" msg="waited for indexes to close"
level=info t=6.882398296s event="sbot closing" msg="closers closed"
2022/12/18 13:42:04 saved identity @d4O7LuzDaSA9uCYTwNG+AleCr/awmHU24iqpIAm02Ys=.ed25519 to testrun/TestBlobTooBigWantedByGo/secret
TAP version 13TestBlobTooBigWantedByGo5
sbot spawned, running before
ok 1 null
got hash:&SqtVEGDZDsEI53s3k9lHORXAgFjYmRK7pFDcAuYoo2c=.sha256
ok 2 null
got hash:&BcIXzoawDrJrq7ETyZxLmbL/cJzPnRCu76l5Qlgw1T4=.sha256
ok 3 null
published test-data
dialing:net:127.0.0.1:34865~shs:d4O7LuzDaSA9uCYTwNG+AleCr/awmHU24iqpIAm02Ys=
ok 4 connected
level=debug t=918.873312ms plugin=ebt event=replicating version=3
level=info t=1.590924069s module=WantManager event=blobs.get ref=<&SqtV.sha256> msg=stored ref=<&SqtV.sha256> sz=5242870
level=debug t=3.963710262s event=update-replicate feed-want-count=1 hops=0 took=75.801µs
level=debug t=6.687413781s plugin=ebt r=<@IH8a.ed25519> event="loop exited"closed sbot
1..4
tests 4
pass 4
ok
JS Sbot process returned
level=debug t=6.693969089s event="network listen loop exited"
level=warn t=6.69400749s unit=gossip event="live qry on rxlog exited"
level=debug t=6.694031191s event="sbot closing" msg="connections closed"
level=debug t=6.694037891s event="sbot closing" msg="waited for indexes to close"
level=info t=6.698856371s event="sbot closing" msg="closers closed"
2022/12/18 13:42:11 saved identity @9FqRPwt4x8lWOpPAT4wrCVUKjDkXrELZCr/X4PBs+UQ=.ed25519 to testrun/TestFeedFromJS/secret
TAP version 13claire
sbot spawned, running before
closed sbot
1..0
tests 0
pass 0
ok
JS Sbot process returned
TAP version 13alice
sbot spawned, running before
ok 1 parallel of publish
ok 2 message countdialing:net:127.0.0.1:38401~shs:9FqRPwt4x8lWOpPAT4wrCVUKjDkXrELZCr/X4PBs+UQ=
ok 3 connected
level=debug t=1.300164886s plugin=ebt event=replicating version=3
level=debug t=4.354739451s event=update-replicate feed-want-count=1 hops=0 took=133.302µs
level=debug t=6.462802508s plugin=ebt r=<@31RM.ed25519> event="loop exited"closed sbot
1..3
tests 3
pass 3
ok
JS Sbot process returned
TAP version 13claire
sbot spawned, running before
shouldnt have alices feed:@31RMZu7FBk3d4P1QJZW+MjwkXF/4L7Vk5zBjUopOWT4=.ed25519
ok 1 follow:@31RMZu7FBk3d4P1QJZW+MjwkXF/4L7Vk5zBjUopOWT4=.ed25519
ssb-friends: get legacy api used
ok 2 friends.get of new contact
ok 3 is following
ok 4 null
ok 5 should be strictly equaldialing:net:127.0.0.1:38401~shs:9FqRPwt4x8lWOpPAT4wrCVUKjDkXrELZCr/X4PBs+UQ=
ok 6 connected
level=debug t=7.379768314s plugin=ebt event=replicating version=3
level=debug t=7.424821861s unit=gossip fr=<@31RM.ed25519> event=gossiptx n=24 starting=0
level=debug t=7.484752854s unit=gossip fr=<@31RM.ed25519> event=gossiptx n=24 starting=0
level=debug t=10.426117492s event=update-replicate feed-want-count=2 hops=0 took=61.701µs
level=debug t=12.539685061s plugin=ebt r=<@L1Gy.ed25519> event="loop exited"now should have feed:@31RMZu7FBk3d4P1QJZW+MjwkXF/4L7Vk5zBjUopOWT4=.ed25519
ok 7 query worked
ok 8 got all the messages
ok 9 latest sequence
ok 10 latest keys match
ok 11 latest sequenceclosed sbot
1..11
tests 11
pass 11
ok
JS Sbot process returned
level=warn t=12.552339648s unit=gossip event="live qry on rxlog exited"
level=debug t=12.552352348s event="network listen loop exited"
level=debug t=12.552389649s event="sbot closing" msg="connections closed"
level=debug t=12.552411749s event="sbot closing" msg="waited for indexes to close"
level=info t=12.557356222s event="sbot closing" msg="closers closed"
2022/12/18 13:42:23 saved identity @fEr1XSmhnfc5HBOjrMIByuO4InYIspEVuBi8ID1Al24=.ed25519 to testrun/TestFeedFromGoNotLive/secret
TAP version 13TestFeedFromGoNotLive9
sbot spawned, running before
ok 1 follow:@fEr1XSmhnfc5HBOjrMIByuO4InYIspEVuBi8ID1Al24=.ed25519
ssb-friends: get legacy api used
ok 2 friends.get of new contact
ok 3 is followingshouldnt have bobs feed:@fEr1XSmhnfc5HBOjrMIByuO4InYIspEVuBi8ID1Al24=.ed25519
ok 4 null
ok 5 should be strictly equal
ok 6 about:%MfcrG5hI0LbGzI0C6md9lMyufMvcEwQSUCP4GFrvmTk=.sha256dialing:net:127.0.0.1:44685~shs:fEr1XSmhnfc5HBOjrMIByuO4InYIspEVuBi8ID1Al24=
ok 7 connected
level=debug t=1.978564472s unit=gossip fr=<@fEr1.ed25519> event=gossiptx n=3 starting=0
level=debug t=1.979301283s plugin=gossip event=gossiprx fr=<@n6kD.ed25519> starting=0 received=2 took=44.536259ms
level=debug t=4.97996002s event=update-replicate feed-want-count=1 hops=0 took=163.903µsnow should have feed:@fEr1XSmhnfc5HBOjrMIByuO4InYIspEVuBi8ID1Al24=.ed25519
ok 8 query worked
ok 9 got all the messages
ok 10 sequence:0
ok 11 sequence:1
ok 12 sequence:2closed sbot
1..12
tests 12
pass 12
ok
JS Sbot process returned
level=debug t=7.105659444s event="network listen loop exited"
level=warn t=7.105696444s unit=gossip event="live qry on rxlog exited"
level=debug t=7.105711745s event="sbot closing" msg="connections closed"
level=debug t=7.105715345s event="sbot closing" msg="waited for indexes to close"
level=info t=7.110856329s event="sbot closing" msg="closers closed"
level=warn t=7.129209029s unit=gossip event="live qry on rxlog exited"
level=debug t=7.12922813s event="network listen loop exited"
level=debug t=7.12925033s event="sbot closing" msg="connections closed"
level=debug t=7.12925603s event="sbot closing" msg="waited for indexes to close"
level=info t=7.13411031s event="sbot closing" msg="closers closed"
2022/12/18 13:42:30 saved identity @5Zqfbx1bbiWfBchFnEX9a4bL3/ni4w7mwgNpzpFzGe8=.ed25519 to testrun/TestFeedFromGoLive/secret
TAP version 13TestFeedFromGoLive10
sbot spawned, running before
ok 1 follow:@5Zqfbx1bbiWfBchFnEX9a4bL3/ni4w7mwgNpzpFzGe8=.ed25519
ssb-friends: get legacy api used
ok 2 friends.get of new contact
ok 3 is followingshouldnt have bobs feed:@5Zqfbx1bbiWfBchFnEX9a4bL3/ni4w7mwgNpzpFzGe8=.ed25519
ok 4 null
ok 5 should be strictly equal
ok 6 about:%ckTv3CvvFE/w+pnp6WJBvhGV7Je/yZOQfAhB5robRzY=.sha256dialing:net:127.0.0.1:43609~shs:5Zqfbx1bbiWfBchFnEX9a4bL3/ni4w7mwgNpzpFzGe8=
ok 7 connected
t=1.925022912s plugin=gossip remote=<@KiUu.ed25519> event=gossiprx live=false handleConnect="oops - dont have feed of remote peer. requesting..."
level=debug t=1.968505924s unit=gossip fr=<@5Zqf.ed25519> event=gossiptx n=4 starting=0
level=debug t=1.969101934s plugin=gossip event=gossiprx fr=<@KiUu.ed25519> starting=0 received=2 took=44.056022ms
t=1.969116534s plugin=gossip remote=<@KiUu.ed25519> event=gossiprx live=false msg="done fetching callee"got message!1
got message!2
got message!3
got message!4
send late msg
got message!5
waited
closed sbot
1..7
tests 7
pass 7
ok
JS Sbot process returned
level=debug t=2.378949843s event="network listen loop exited"
level=warn t=2.378986743s unit=gossip event="live qry on rxlog exited"
level=debug t=2.379009744s event="sbot closing" msg="connections closed"
level=debug t=2.379016444s event="sbot closing" msg="waited for indexes to close"
level=info t=2.384527634s event="sbot closing" msg="closers closed"
2022/12/18 13:42:33 saved identity @+SO47x38dQhF+DazKc1f4vli+v9xAkoLbUtbZHy5KRg=.ed25519 to testrun/TestStartup/secret
level=warn t=19.983727ms unit=gossip event="live qry on rxlog exited"
level=debug t=30.020893384s event="network listen loop exited"
level=debug t=30.020918985s event="sbot closing" msg="connections closed"
level=debug t=30.020978687s event="sbot closing" msg="waited for indexes to close"
level=info t=30.026641635s event="sbot closing" msg="closers closed"
--- FAIL: TestStartup (30.04s)
interop_test.go:123: go-sbot: @+SO47x38dQhF+DazKc1f4vli+v9xAkoLbUtbZHy5KRg=.ed25519
interop_test.go:272: timeout
interop_test.go:123: go-sbot: @+SO47x38dQhF+DazKc1f4vli+v9xAkoLbUtbZHy5KRg=.ed25519
local_fork_test.go:48:
Error Trace: /home/runner/work/go-ssb/go-ssb/tests/local_fork_test.go:48
Error: Not equal:
expected: int(2)
actual : int64(1)
Test: TestStartup
Messages: maggie seqno of log with 3 messages should be 2
local_fork_test.go:52:
Error Trace: /home/runner/work/go-ssb/go-ssb/tests/local_fork_test.go:52
Error: Not equal:
expected: int(3)
actual : int64(1)
Test: TestStartup
Messages: maggie seqno of log with 4 messages should be 3
FAIL
FAIL github.com/ssbc/go-ssb/tests 70.591s
FAIL
make: *** [Makefile:22: test] Error 1
I finally got a failure out of 684 runs. So it's not fixed yet, but it's really rare.
2023/01/29 04:52:00 [margaret/indexes/badger] overwrote batch limit 0
=== RUN TestStartup
2023/01/29 04:52:00 saved identity @WkV0DEbbnTTFXxi9Rq4VXjJLfOLoaHSZlgi8tYGLhgg=.ed25519 to testrun/TestStartup/secret
interop_test.go:123: go-sbot: @WkV0DEbbnTTFXxi9Rq4VXjJLfOLoaHSZlgi8tYGLhgg=.ed25519
level=warn t=47.790705ms unit=gossip event="live qry on rxlog exited"
interop_test.go:272: timeout
level=debug t=30.050335447s event="sbot closing" msg="connections closed"
level=debug t=30.050420557s event="sbot closing" msg="waited for indexes to close"
level=debug t=30.050294511s event="network listen loop exited"
level=info t=30.0705814s event="sbot closing" msg="closers closed"
interop_test.go:123: go-sbot: @WkV0DEbbnTTFXxi9Rq4VXjJLfOLoaHSZlgi8tYGLhgg=.ed25519
local_fork_test.go:53:
Error Trace: /home/go-ssb/go-ssb/tests/local_fork_test.go:53
Error: Not equal:
expected: int(2)
actual : int64(1)
Test: TestStartup
Messages: maggie seqno of log with 3 messages should be 2
--- FAIL: TestStartup (30.09s)
level=warn t=30.087880018s unit=gossip event="live qry on rxlog exited"
FAIL
FAIL github.com/ssbc/go-ssb/tests 30.116s
FAIL
Okay, I think I found it. It was doing Publish()
es and then immediately calling Seq()
directly on the feed log, which is based on the indexes. The feed log has no mechanism to ask the bot's indexing if the indexes are up-to-date, so we had a race condition. Going to let this re-run again for a while with that problem fixed and see if it fixes the test.