PowerDNS/simpleblob

Update Marker functionality not working correctly

franklouwers opened this issue · 5 comments

When using (real) S3 buckets, the update-marker function doesn't work when using a Prefix (as is recommended for the LightningStream code).

I believe this might be because the Update Marker is checked here:

https://github.com/PowerDNS/simpleblob/blob/main/backends/s3/s3.go#128

which performs a Load() of the markerfile, which starts with prepending the globalPrefix here:

https://github.com/PowerDNS/simpleblob/blob/main/backends/s3/s3.go#L204

However, the Marker is stored without the Prefix.

The interesting this is that the fallback mechanism at https://github.com/PowerDNS/simpleblob/blob/main/backends/s3/s3.go#L138 also doesn't work: even after the UpdateMarkerForceListInterval is passed, no LIST call is performed. Could this be because the Load() function doesn't return an error when it can't find the file? Or that that error isn't properly handled?

wojas commented

Thanks for reporting! It does look like setMarker should indeed be adding the prefix.

The interesting this is that the fallback mechanism at https://github.com/PowerDNS/simpleblob/blob/main/backends/s3/s3.go#L138 also doesn't work: even after the UpdateMarkerForceListInterval is passed, no LIST call is performed. Could this be because the Load() function doesn't return an error when it can't find the file? Or that that error isn't properly handled?

We get a 'does not exist error' and then happily use the empty string as the marker, which means that the marker is the same as the last time we checked the marker. But I'm not sure what is going wrong with this interval check then.

@wojas re fall-back mech: I noticed that the behaviour without marker (https://github.com/PowerDNS/simpleblob/blob/main/backends/s3/s3.go#L125) is quite different from the behaviour with marker, but falling back: https://github.com/PowerDNS/simpleblob/blob/main/backends/s3/s3.go#L147

Seems like you strip the globalPrefix in the doList() function https://github.com/PowerDNS/simpleblob/blob/main/backends/s3/s3.go#L184-L188. So a regular List() with the marker flag disabled will return the blob list without the prefix.

Yet the fallback executes doList() without prefix (so no stripping), and then filters them here: https://github.com/PowerDNS/simpleblob/blob/main/backends/s3/s3.go#L158. However, the filenames would still include the prefix.

Also seems like the purpose of the marker, is to mark for multiple prefixes. So your suggestion to just add the prefix when storing the marker, might not be in line with that idea.

wojas commented

The global prefix is meant to allow the use of a 'subdirectory' of an S3 bucket to sync data. This can also conveniently be used to separate different Lightning Stream schemaversions. In this use case it makes sense to have each prefix have its own marker file. The user may not even have write permissions to the root of the bucket.

Yes, I think so as well (except for possible schema migrations later, but those would likely be best to do a "full sync" anyway). In that case, there's a lot of logic at the bottom of the List() function which can be simplified.