ceph/go-ceph

TestRadosGWTestSuite/TestUserBucket is consistently failing in CI

Closed this issue · 5 comments

It started failing on ceph main runs a few days back:
https://github.com/ceph/go-ceph/actions/runs/8698488588

Now it is also failing on pre-squid:
https://github.com/ceph/go-ceph/actions/runs/8730978262

This pattern looks a bit like something changed in RGW and perhaps just got backported to squid very recently. This latter suggestion may help us find out what the change was (maybe?).

Does anyone reasonably well versed in RGW want to look into this?

Even though the test is failing, the new behaviour seems to be the right one for this error case. RGW used to send an empty list for an unknown user; now it sends an error. I tried to track down which PR might have caused this change, and it looks like ceph/ceph@5083612 made a lot of changes to that API.


@thotz Would you be willing to do a PR to fix that? Because you seem to be the last man standing who knows something about RGW. 😬

I tried to map git commit hashes to container image hashes around the time the CI failure was first reported, and settled on ceph/ceph#56863 as the change that converted the response from an empty list ([]) to null.

Previously:

HTTP/1.1 200 OK
Content-Length: 2
Connection: Keep-Alive
Date: Tue, 16 Apr 2024 01:41:10 GMT
Server: Ceph Object Gateway (squid)
X-Amz-Request-Id: tx00000c270c1b2e7b9ad07-00661dd736-4196-default

[]

and now:

HTTP/1.1 200 OK
Connection: Keep-Alive
Date: Thu, 18 Apr 2024 01:40:53 GMT
Server: Ceph Object Gateway (squid)
X-Amz-Request-Id: tx0000031b3662f2b264cdc-0066207a25-4188-default
Content-Length: 0


=== NAME  TestRadosGWTestSuite/TestUserBucket
    user_bucket_test.go:37: 
        	Error Trace:	/go/src/github.com/ceph/go-ceph/rgw/admin/user_bucket_test.go:37
        	Error:      	Received unexpected error:
        	            	failed to unmarshal radosgw http response. . unexpected end of JSON input
        	Test:       	TestRadosGWTestSuite/TestUserBucket
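The failure above can be reproduced without RGW. The following is a minimal sketch, assuming the client decodes the response body into a list of bucket names with encoding/json; `decodeBuckets` is a hypothetical helper written for illustration, not the go-ceph implementation, and its error prefix is only modelled on the message in the log.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeBuckets is a hypothetical stand-in for the client-side decoding step.
// It unmarshals a response body into a list of bucket names.
func decodeBuckets(body []byte) ([]string, error) {
	var buckets []string
	if err := json.Unmarshal(body, &buckets); err != nil {
		// Error prefix modelled on the CI log, not copied from go-ceph.
		return nil, fmt.Errorf("failed to unmarshal radosgw http response. %w", err)
	}
	return buckets, nil
}

func main() {
	// Previous response: body "[]" (Content-Length: 2) decodes to an empty list.
	buckets, err := decodeBuckets([]byte("[]"))
	fmt.Println(len(buckets), err)

	// Current response: empty body (Content-Length: 0) is not valid JSON.
	_, err = decodeBuckets([]byte(""))
	fmt.Println(err)
}
```

An empty body is not a valid JSON document, so the decoder reports "unexpected end of JSON input" even though the HTTP status is still 200 OK.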

@ansiwen @phlogistonjohn @anoopcs9 We can modify the test case to expect an error rather than no error, but this may require a version check for Ceph. Otherwise we can remove the test case entirely, since it is checking an invalid case.

@ansiwen @phlogistonjohn @anoopcs9 We can modify the test case to expect an error rather than no error, but this may require a version check for Ceph.

Modifying the test case accordingly calls for a new test file with proper build tags to differentiate between Ceph versions. Let's not go that way.

Otherwise we can remove the test case entirely, since it is checking an invalid case.

I would vote for the removal of those related test cases.