google/safebrowsing

Inconsistent Database: Threat List Mismatches

Opened this issue · 7 comments

Hello!

We're using the sbserver binary. Over the past 24 hours we've been seeing a series of inconsistent database errors that seem to resolve themselves after a database fetch, but during that 30 minute time window every request generates an error.

Log Excerpt
safebrowsing: 2021/01/12 11:16:20 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 11:16:21 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 11:16:21 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 11:16:24 database.go:389: database is now healthy
safebrowsing: 2021/01/12 11:16:24 safebrowser.go:563: background threat list updated
safebrowsing: 2021/01/12 11:16:24 safebrowser.go:557: Next update in 30m29s
/* snipped */
safebrowsing: 2021/01/12 17:49:46 database.go:243: Server requested next update in 29m49.769s
safebrowsing: 2021/01/12 17:49:46 database.go:257: update failure: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 17:49:46 safebrowser.go:557: Next update in 29m49.769s
safebrowsing: 2021/01/12 17:49:46 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 17:49:46 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 17:49:46 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch

We also sometimes see this variation.

Log Excerpt
safebrowsing: 2021/01/12 18:19:36 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 18:19:37 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 18:19:37 database.go:243: Server requested next update in 29m56.277s
safebrowsing: 2021/01/12 18:19:37 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 18:19:37 safebrowser.go:419: inconsistent database: safebrowsing: threat list SHA256 mismatch
safebrowsing: 2021/01/12 18:19:38 database.go:389: database is now healthy
safebrowsing: 2021/01/12 18:19:39 safebrowser.go:563: background threat list updated
safebrowsing: 2021/01/12 18:19:39 safebrowser.go:557: Next update in 29m56.277s
safebrowsing: 2021/01/12 18:49:35 database.go:243: Server requested next update in 30m23.724s
safebrowsing: 2021/01/12 18:49:38 safebrowser.go:563: background threat list updated
safebrowsing: 2021/01/12 18:49:38 safebrowser.go:557: Next update in 30m23.724s
safebrowsing: 2021/01/12 19:20:02 database.go:243: Server requested next update in 30m13.773s
safebrowsing: 2021/01/12 19:20:02 database.go:248: invalid server response: got 0, want 3 threat lists
safebrowsing: 2021/01/12 19:20:02 safebrowser.go:557: Next update in 30m13.773s
safebrowsing: 2021/01/12 19:20:02 safebrowser.go:419: inconsistent database: safebrowsing: threat list count mismatch
safebrowsing: 2021/01/12 19:20:02 safebrowser.go:419: inconsistent database: safebrowsing: threat list count mismatch

This is what is returned from status:

{"Stats":{"QueriesByDatabase":55981296,"QueriesByCache":34,"QueriesByAPI":22829,"QueriesFail":24489,"DatabaseUpdateLag":0},"Error":""}

Do you know what the issue might be? Is there a suggested way to debug this further? The other "inconsistent database" issues I saw looked to be accompanied by unexpected status codes.

Thanks for the report! We believe this is associated with a recent server push and will investigate further. Stay tuned for updates.

Thanks for looking into it!

Following up here -- it seems we have misalignment with our protocol specification. Specifically, our server does not guarantee returning a list update response for every requested list. We've clarified this in our public documentation and will soon send out a fix to remove this check.

Thank you!

Is it any update on the issue ?

Hello! We are seeing the similar issue starting around 9:50am PST 10-27-2021, do you know what issue it might be?

Also an user who encounter threat list count mismatch several times here, the latest one is on 2021-10-27
which is caused by code below

safebrowsing/database.go

Lines 246 to 251 in bbf0d20

if len(resp.ListUpdateResponses) != numTypes {
db.setError(errors.New("safebrowsing: threat list count mismatch"))
db.log.Printf("invalid server response: got %d, want %d threat lists",
len(resp.ListUpdateResponses), numTypes)
return nextUpdateWait, false
}

safebrowsing periodically sends request with listUpdateRequests[] to update different threatType, platformType, etc., and check if it receive same amount of response.

It looks like on 2021-10-27, it sent request with 3 items to update while only received 1 item in listUpdateResponses

{"level":"info","msg":"safebrowsing: 2021/10/27 14:28:49 database.go:248: invalid server response: got 1, want 3 threat lists","time":"2021-10-27T14:28:49.83952Z"}

Based on document, listUpdateResponses[] can be less than listUpdateRequests[]. So we comment out L247 and L250 as below to let it still update local list even receive different amount of response, and only record the mismatch event in log without raising db error.

if len(resp.ListUpdateResponses) != numTypes {
    //db.setError(errors.New("safebrowsing: threat list count mismatch"))
    db.log.Printf("invalid server response: got %d, want %d threat lists",
        len(resp.ListUpdateResponses), numTypes)
    //return nextUpdateWait, false
}