appmattus/certificatetransparency

New SSLPeerUnverifiedExceptions after migrating from 1.1.1 to 2.4.7

Opened this issue · 3 comments

Hey 👋

First of all, thanks for all your work on the library and its maintenance 👍

I recently updated the library from v1.1.1 to v2.4.7 to be able to use the v3 log lists without any custom setup of the lib and to follow the latest developments. Since the app versions containing this update have rolled out to our production users, we have spotted new SSLPeerUnverifiedException reports in Firebase.

Those reports only concern the app versions containing the lib update, not the previous versions running v1.1.1:
(Firebase screenshots attached: "Screenshot 2023-07-26 at 17 44 18" and "Screenshot 2023-07-26 at 17 45 24")

For the migration itself, I went from this (1.1.1 library version setup):

```kotlin
private const val CERTIFICATE_TRANSPARENCY_LOG_LIST_V3 = "https://www.gstatic.com/ct/log_list/v3/"

.addNetworkInterceptor(certificateTransparencyInterceptor {
    // Use the v3 CT log list since the v2 one will be dropped by Google on November 17th.
    // The CT library we use doesn't support v3 yet, so we need to change the base URL ourselves:
    // https://github.com/appmattus/certificatetransparency/issues/44
    val logListV3 = LogListDataSourceFactory.createLogListService(
        baseUrl = CERTIFICATE_TRANSPARENCY_LOG_LIST_V3
    )
    setLogListService(logListV3)
})
```

to this (2.4.7 library version setup):

```kotlin
.addNetworkInterceptor(certificateTransparencyInterceptor())
```

when building the app's OkHttpClient. This follows what's described in the docs for OkHttp setups: https://github.com/appmattus/certificatetransparency/blob/main/docs/okhttp.md
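
For reference, here's roughly how the client construction looks after the migration; only the certificateTransparencyInterceptor() line is from my real setup, the rest of the builder is a minimal sketch with everything else omitted:

```kotlin
import com.appmattus.certificatetransparency.certificateTransparencyInterceptor
import okhttp3.OkHttpClient

// Minimal sketch of the post-migration client construction; the real app adds
// other interceptors, timeouts, etc. which are left out here.
val okHttpClient: OkHttpClient = OkHttpClient.Builder()
    .addNetworkInterceptor(certificateTransparencyInterceptor())
    .build()
```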

So does this ring a bell for anyone? Should I set up the library differently, through installCertificateTransparencyProvider for example, to have a better chance of avoiding those failures?
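
To make that alternative concrete, here's a sketch of what I imagine the provider-based setup would look like; I haven't tried it in this app, and MyApp is just a placeholder Application class:

```kotlin
import android.app.Application
import com.appmattus.certificatetransparency.installCertificateTransparencyProvider

class MyApp : Application() {
    override fun onCreate() {
        super.onCreate()
        // Sketch: install the CT provider once at startup so certificate
        // transparency is enforced process-wide, instead of adding the
        // interceptor to each OkHttpClient.
        installCertificateTransparencyProvider()
    }
}
```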

Thanks in advance for any help ๐Ÿ™

When I was debugging #97 I could see that coroutine job cancellations get transformed into SSLPeerUnverifiedExceptions, which may then bubble up to wherever you are logging errors. I am also getting a ton of these in Crashlytics.

Thanks for your insights @sudojhill! That can indeed explain part of the reports I'm getting, but I also got feedback about users suddenly being stuck in the app because all network calls were blocked, and they had to restart the app to be able to use it again, which really impacts the user experience.
For info, I unfortunately rolled back to version 1.1.1 of the lib (with the custom setup to use the v3 log lists) to restore the user experience 😬

The blocked network calls are, I believe, also explained by the ticket I mentioned. Once you are in that state, you are stuck.

#98 is another ticket I raised that questions the caching mechanism: once the cache is considered too old but the remote log list hasn't been updated, every request triggers a new fetch of the remote log list, which causes #97 very easily.

If you wanted to try something, you could use LogListDataSourceFactory.createDataSource and change the last parameter to something like Instant.now().minus(14, ChronoUnit.DAYS) so that the local cache does not expire so quickly, and hopefully the remote log list gets updated within those 14 days. The default is 1 day, which I think is not enough. Roughly something like the sketch below.
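
A rough sketch of what I mean; the `now` parameter name, it being a supplier, and the setLogListDataSource() wiring reflect my reading of the 2.x API, so double-check the actual createDataSource signature in the version you use:

```kotlin
import com.appmattus.certificatetransparency.certificateTransparencyInterceptor
import com.appmattus.certificatetransparency.loglist.LogListDataSourceFactory
import java.time.Instant
import java.time.temporal.ChronoUnit

// Sketch of the workaround: build the log list data source ourselves and shift
// "now" back 14 days so the cached log list is not treated as expired after the
// default 1 day. Parameter names here are assumptions, not verified.
val ctInterceptor = certificateTransparencyInterceptor {
    setLogListDataSource(
        LogListDataSourceFactory.createDataSource(
            logListService = LogListDataSourceFactory.createLogListService(),
            now = { Instant.now().minus(14, ChronoUnit.DAYS) }
        )
    )
}
```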