Feature Request: Automatic In-Memory TLS Certificate Reload Based on File Changes
What is the problem your feature solves, or the need it fulfills?
Currently, when TLS certificates are renewed and the underlying certificate and private key files are updated on disk, Pingora requires a mechanism to start using these new credentials.
The primary supported method for this with zero downtime for existing connections is the graceful upgrade (SIGQUIT on Unix), which involves stopping the old process and starting a new one that then reads the updated files. While effective for zero-downtime, this still involves a full process restart.
My current workaround is more direct but involves downtime: I detect the certificate renewal, then I completely kill the running Pingora process and start it again. The new process, upon startup, naturally reads and loads the new certificate and key files. This approach, while simple to implement from an external scripting perspective, causes a service interruption.
This feature request is for developers and operators who need to update TLS certificates frequently and wish to do so with minimal operational overhead and without any process restart (neither a full kill/restart nor a graceful upgrade process handoff), thereby allowing the existing Pingora process to continue running while seamlessly transitioning to new certificates for new connections.
Describe the solution you'd like
I propose a feature where Pingora can automatically detect changes to its configured TLS certificate and private key files and reload them into memory for use with new incoming TLS connections, without requiring a process restart or graceful upgrade.
A possible way this could work:
- Initial Load & Metadata Storage: When Pingora loads a TLS certificate and private key from file paths specified in TlsSettings (e.g., via TlsSettings::intermediate()), it would also store metadata about these files, specifically their last modification timestamps (or creation dates, though modification timestamps are generally more reliable for detecting updates).
- Periodic Background Check: Pingora (perhaps via an opt-in background task associated with the TLS listener or a global server setting) would periodically check the modification timestamps of the configured certificate and private key files on disk. The check interval could be configurable.
- Detection of Change: If the current modification timestamps of the files on disk are newer than the timestamps stored from the last successful load, Pingora would recognize that the files have been updated.
- In-Memory Reload: Upon detecting an update, Pingora would:
  - Attempt to load the new certificate and private key from the configured paths.
  - If successful, update its internal TLS context (e.g., SSL_CTX or equivalent for the chosen TLS backend) for the relevant listener(s) to use these new credentials.
  - Use this new TLS context for all subsequent new TLS handshakes.
  - Leave existing, established TLS connections on the certificate they were established with until they naturally terminate.
- Error Handling: If loading the new files fails (e.g., malformed certificate, incorrect permissions), Pingora should log the error clearly and continue using the existing (old) valid certificate, possibly retrying the load on the next check interval.
- Configuration: This feature could be enabled per-listener or globally, perhaps with a setting like auto_reload_tls_certs: true and tls_cert_check_interval_secs: 300 within the TlsSettings or server configuration.
This mechanism would allow the Pingora process to remain running continuously while seamlessly adopting new TLS certificates as they become available on the filesystem.
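The polling mechanism described above could be sketched roughly as follows, using only the Rust standard library. This is a minimal illustration, not Pingora's API: `CertPair`, `CertReloader`, and `check_once` are hypothetical names, the "validation" is a stand-in for real PEM parsing, and the sketch treats any change in modification time (not only a newer one) as an update.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, RwLock};
use std::time::SystemTime;

/// Hypothetical in-memory credentials. A real backend would hold a parsed
/// certificate chain and key (e.g. an SSL_CTX) rather than raw bytes.
#[derive(Debug, PartialEq)]
pub struct CertPair {
    pub cert_pem: Vec<u8>,
    pub key_pem: Vec<u8>,
}

/// Hypothetical reloader that remembers the modification times seen at the
/// last successful load.
pub struct CertReloader {
    cert_path: PathBuf,
    key_path: PathBuf,
    last_seen: (SystemTime, SystemTime),
    current: RwLock<Arc<CertPair>>,
}

fn mtime(path: &Path) -> io::Result<SystemTime> {
    fs::metadata(path)?.modified()
}

fn load(cert: &Path, key: &Path) -> io::Result<CertPair> {
    let pair = CertPair { cert_pem: fs::read(cert)?, key_pem: fs::read(key)? };
    // Stand-in validation (assumption): a real implementation would parse the
    // PEM data and check that the key matches the certificate.
    if pair.cert_pem.is_empty() || pair.key_pem.is_empty() {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "empty certificate or key"));
    }
    Ok(pair)
}

impl CertReloader {
    /// Initial load: read both files and record their modification times.
    pub fn new(cert_path: PathBuf, key_path: PathBuf) -> io::Result<Self> {
        let last_seen = (mtime(&cert_path)?, mtime(&key_path)?);
        let pair = load(&cert_path, &key_path)?;
        Ok(Self { cert_path, key_path, last_seen, current: RwLock::new(Arc::new(pair)) })
    }

    /// Credentials that a new handshake would use. Connections that already
    /// hold an older `Arc` keep it until they drop it.
    pub fn current(&self) -> Arc<CertPair> {
        self.current.read().unwrap().clone()
    }

    /// One polling step. Returns `Ok(true)` if new credentials were installed,
    /// `Ok(false)` if nothing changed. On `Err` the old credentials stay
    /// active and the stored timestamps are untouched, so the next call retries.
    pub fn check_once(&mut self) -> io::Result<bool> {
        let seen = (mtime(&self.cert_path)?, mtime(&self.key_path)?);
        if seen == self.last_seen {
            return Ok(false);
        }
        let pair = load(&self.cert_path, &self.key_path)?;
        *self.current.write().unwrap() = Arc::new(pair);
        self.last_seen = seen;
        Ok(true)
    }
}
```

A background task would call `check_once` at the configured interval and log any `Err` it returns. A file that is replaced between the timestamp read and the content read is a race this sketch does not handle; writing renewed files to a temporary name and renaming them into place would narrow that window.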
Describe alternatives you've considered
- Current Graceful Upgrade (SIGQUIT):
  - Description: The existing mechanism where the old process passes FDs to a new process. The new process reads the updated certs from disk.
  - Tradeoffs:
    - Pros: Robust, handles all types of configuration changes, zero downtime for client connections.
    - Cons: Still involves stopping one process and starting another, which has some overhead (process creation, re-initialization of application state not carried over FDs). For just a certificate update, this can feel like a heavier operation than necessary.
- Manual Kill and Restart (My Current Workaround):
  - Description: Stop the Pingora process entirely and start a new one.
  - Tradeoffs:
    - Pros: Simple to implement externally.
    - Cons: Causes service downtime during the restart period. Not suitable for production environments requiring high availability.
- External Signal + Manual In-Memory Reload API (A different proposal):
  - Description: An external script detects cert changes and sends a specific signal (e.g., SIGHUP or SIGUSR1) to Pingora, or calls an admin API endpoint. Pingora then re-reads the cert files.
  - Tradeoffs:
    - Pros: More direct control than periodic polling. Could be faster if the signal is immediate.
    - Cons Compared to Proposed Solution: Requires more active external orchestration to trigger the reload. The proposed file-watching solution is more "set and forget." The proposed solution is also self-contained within Pingora once configured.
The proposed solution (automatic detection via file modification timestamps) offers a good balance by being self-contained within Pingora once enabled, requiring minimal external orchestration, and avoiding process restarts entirely for certificate updates.
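For comparison, the explicit-trigger alternative (external signal or admin API) might look like the sketch below. It is an assumption-laden illustration: `Control` and `spawn_reload_worker` are hypothetical names, the channel stands in for whatever actually receives the signal or API call, and the credentials are raw bytes rather than a real TLS context.

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Hypothetical message sent by whatever receives the external trigger
/// (a signal-handling thread or an admin endpoint, neither shown here).
pub enum Control {
    ReloadCerts,
    Shutdown,
}

/// Spawns a worker that reloads only when told to, instead of polling.
/// `load` is a caller-supplied function that reads and validates credentials.
/// The worker returns the number of successful reloads when it exits.
pub fn spawn_reload_worker<F>(
    active: Arc<RwLock<Arc<Vec<u8>>>>,
    load: F,
) -> (Sender<Control>, JoinHandle<u32>)
where
    F: Fn() -> Result<Vec<u8>, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<Control>();
    let handle = thread::spawn(move || {
        let mut reloads = 0u32;
        for msg in rx {
            match msg {
                Control::ReloadCerts => match load() {
                    Ok(bytes) => {
                        // Swap in the new credentials for future handshakes.
                        *active.write().unwrap() = Arc::new(bytes);
                        reloads += 1;
                    }
                    // Keep serving the old credentials if the load fails.
                    Err(e) => eprintln!("certificate reload failed, keeping old one: {e}"),
                },
                Control::Shutdown => break,
            }
        }
        reloads
    });
    (tx, handle)
}
```

The swap logic is the same as in the polling approach. The difference is that something outside this worker must send `Control::ReloadCerts`, which is the orchestration cost noted in the tradeoffs.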
Additional context
This type of automatic certificate reloading based on file changes is a common feature in other reverse proxies and web servers (e.g., Nginx can be triggered to reload certificates with SIGHUP after files are updated, and Caddy reloads them automatically). Implementing a similar self-monitoring capability in Pingora would enhance its operational ease-of-use for managing TLS certificates, especially in automated environments.
The key goal is to allow the main Pingora process to live indefinitely while its TLS credentials can be updated transparently for new connections simply by replacing the certificate files on disk.
Thank you for considering this feature.
@scottgre you might want to take a look at this reference first: #611, which used BoringSSL and a dynamic cert setup.
For a quick concept, you can do something like this:
[global_certlist_allocated]
{allocate using RwLock, Mutex, static mut or lazy_static}
[thread_A]
{doing hot-reload from file}
[boringssl]
{check in memory saved checksum, if checksum diff, fetch newest}
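One way to read the outline above in Rust is sketched below, using only the standard library. All names (`StoredCert`, `publish_if_changed`, `fetch_if_newer`) are hypothetical, the checksum is a non-cryptographic hash used only to notice a change, and how the bytes reach BoringSSL during the handshake is deliberately left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, OnceLock, RwLock};

/// Hypothetical entry in the global certificate store.
pub struct StoredCert {
    pub checksum: u64,
    pub pem: Arc<Vec<u8>>,
}

/// Process-wide store ("global_certlist_allocated"), created on first use.
static CERT_STORE: OnceLock<RwLock<Option<StoredCert>>> = OnceLock::new();

fn store() -> &'static RwLock<Option<StoredCert>> {
    CERT_STORE.get_or_init(|| RwLock::new(None))
}

/// Non-cryptographic checksum, used only to detect that the bytes changed.
pub fn checksum(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Reloader side ("thread_A"): call with freshly read file contents.
/// Returns true if the store was updated.
pub fn publish_if_changed(pem: Vec<u8>) -> bool {
    let sum = checksum(&pem);
    let mut guard = store().write().unwrap();
    if guard.as_ref().map(|c| c.checksum) == Some(sum) {
        return false;
    }
    *guard = Some(StoredCert { checksum: sum, pem: Arc::new(pem) });
    true
}

/// Handshake side: compare the caller's cached checksum with the stored one
/// and return the newest certificate only if they differ.
pub fn fetch_if_newer(cached: Option<u64>) -> Option<(u64, Arc<Vec<u8>>)> {
    let guard = store().read().unwrap();
    let result = match guard.as_ref() {
        Some(c) if Some(c.checksum) != cached => Some((c.checksum, Arc::clone(&c.pem))),
        _ => None,
    };
    result
}
```

A reloader thread would periodically read the certificate file and pass its contents to `publish_if_changed`. The handshake path keeps the last checksum it saw and calls `fetch_if_newer` to decide whether to rebuild its certificate state.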