GoogleCloudPlatform/cloud-sql-python-connector

Recover gracefully from sleep

enocom opened this issue · 1 comment

Feature Description

When a Connector is running on a machine that goes to sleep for more than 1 hour, the machine wakes up with an expired certificate. Because in TLS 1.3 the client completes its side of the handshake before the server has validated the client certificate, the Connector will not see a failed handshake, and users are forced to restart the process to fix the problem.

Instead, we should check whether the certificate retrieved from the cache is invalid (i.e., expired). If it is, we should block on a forced refresh attempt until we get a fresh cert.
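As a rough illustration of the proposed behavior, here is a minimal sketch, not the Connector's actual code. `ConnectionInfo`, `CertCache`, `is_expired`, and the `refresh` callable are hypothetical names made up for this example; the only idea taken from the issue is "check the cached cert's expiration and block on a forced refresh if it has passed".

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Awaitable, Callable, Optional


@dataclass
class ConnectionInfo:
    """Hypothetical stand-in for the cached cert data; only the expiration matters here."""

    expiration: datetime


def is_expired(info: ConnectionInfo, now: Optional[datetime] = None) -> bool:
    """Return True when 'now' is at or after the certificate expiration."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= info.expiration


class CertCache:
    """Hypothetical cache that blocks on a forced refresh when the cached cert is expired."""

    def __init__(self, refresh: Callable[[], Awaitable[ConnectionInfo]]) -> None:
        self._refresh = refresh  # assumed: coroutine function that fetches a new cert
        self._current: Optional[ConnectionInfo] = None
        self._lock = asyncio.Lock()

    async def connect_info(self, now: Optional[datetime] = None) -> ConnectionInfo:
        # Hold the lock so concurrent callers wait on one refresh instead of
        # each starting their own.
        async with self._lock:
            if self._current is None or is_expired(self._current, now):
                # Block the caller until a fresh cert is available.
                self._current = await self._refresh()
            return self._current
```

With this shape, a process that wakes from sleep pays for one blocking refresh on its next connection attempt instead of handing an expired certificate to the TLS layer. The `now` parameter exists only to make the check easy to exercise in tests.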

See GoogleCloudPlatform/cloud-sql-proxy#1788 and GoogleCloudPlatform/cloud-sql-go-connector#686 for details.

This should be ported to AlloyDB Python as well.

This is a comment to make sure proper debug logs are added around the valid cert check:

  • The result of the invalid certificate check (i.e., is “now” after the expiration?)
  • Valid cert = True/False
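The two log lines requested above could look roughly like the following sketch. The logger name, the `check_cert_valid` function, and the message wording are assumptions for illustration and are not taken from the Connector's code base.

```python
import logging
from datetime import datetime, timezone
from typing import Optional

# Hypothetical logger name; a real implementation would use the library's own logger.
logger = logging.getLogger("cert_check_example")


def check_cert_valid(expiration: datetime, now: Optional[datetime] = None) -> bool:
    """Log the expiration check and its outcome at debug level, then return validity."""
    if now is None:
        now = datetime.now(timezone.utc)
    expired = now >= expiration
    valid = not expired
    # First requested log: the result of the invalid certificate check.
    logger.debug(
        "Cert expiration check: now=%s, expiration=%s, now after expiration=%s",
        now.isoformat(),
        expiration.isoformat(),
        expired,
    )
    # Second requested log: whether the cert is valid.
    logger.debug("Valid cert = %s", valid)
    return valid
```

Calling `logging.basicConfig(level=logging.DEBUG)` before the check would make both lines visible during debugging.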