Dynamic TLS Certificates
Node.js provides SNICallback in its https module to get the appropriate certificate dynamically, without needing a predefined set. This is a very useful feature, as it allows changing, adding, or removing certificates without restarting the server.
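const https = require('https');

const server = https.createServer({
  // Called during the TLS handshake with the host name requested via SNI.
  SNICallback: (hostName, callback) => {
    // getCertificate is the application's own lookup (not a Node.js API);
    // Node.js expects the value passed to callback to be a SecureContext.
    const certificate = getCertificate(hostName);
    if (!certificate) {
      return callback(new Error('No certificate found for host name ' + hostName));
    }
    return callback(null, certificate);
  }
});

server.listen(443);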
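Since Node.js expects a SecureContext from the callback, one way to back the lookup is a Map from host name to contexts built with tls.createSecureContext. The following is only a minimal sketch under that assumption; createSniCallback and the contexts map are hypothetical names, not part of Node.js.

```javascript
// Build an SNICallback backed by a mutable Map of host name -> SecureContext.
// `createSniCallback` and `contexts` are illustrative names, not Node.js APIs.
function createSniCallback(contexts) {
  return (hostName, callback) => {
    const context = contexts.get(hostName);
    if (!context) {
      return callback(new Error('No certificate found for host name ' + hostName));
    }
    return callback(null, context);
  };
}

// Usage (assumes keyPem/certPem hold PEM strings loaded elsewhere):
//   const tls = require('tls');
//   const https = require('https');
//   const contexts = new Map();
//   contexts.set('example.com', tls.createSecureContext({ key: keyPem, cert: certPem }));
//   https.createServer({ SNICallback: createSniCallback(contexts) }, handler).listen(443);
```

Because the callback reads the Map on every handshake, adding or deleting an entry takes effect for the next connection without restarting the server.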
FibJS does not currently support this feature, but it would be great if it did. To support it, FibJS would need to provide a getCertificate(hostName) method in its SslServer and HttpsServer classes. This method would take the hostName as an argument and return the appropriate certificate.
const server = new HttpsServer(/* Array certs, */ Integer port, Handler listener);
server.getCertificate = function(name) {
  return {
    certificate: [X509Cert object],
    key: [PKey object]
  };
};
server.start();
The current version of FibJS supports SNI certificates, but this feature is not designed for updating certificates. However, the development branch of FibJS is currently upgrading the SSL library to OpenSSL, and SNI is considered an insecure protocol. Therefore, SNI support will not be included in the version of FibJS after upgrading to OpenSSL.
Given that your requirement is to update certificates, I recommend restarting the server. Updating certificates is not a frequent operation, and introducing a callback during SSL handshake for this purpose may not be very worthwhile.
I checked the Node.js documentation and found that it supports setSecureContext:
https://nodejs.org/docs/latest/api/tls.html#serversetsecurecontextoptions
This seems like a better way to change certificates.
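As a rough illustration of that approach in Node.js, the sketch below replaces the key and certificate of a running server. rotateCertificate is a hypothetical helper name; only server.setSecureContext is the Node.js method linked above.

```javascript
// Swap the certificate of a running TLS/HTTPS server in place.
// `rotateCertificate` is an illustrative helper, not a Node.js API.
function rotateCertificate(server, keyPem, certPem) {
  // New connections use the new key/cert; existing connections are not interrupted.
  server.setSecureContext({ key: keyPem, cert: certPem });
}

// Usage (assumes `server` came from https.createServer and the PEM strings
// were read from wherever the renewed certificate is stored):
//   rotateCertificate(server, newKeyPem, newCertPem);
```

Note that this replaces the server's single context; it does not select between certificates per host name.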
Given that your requirement is to update certificates, I recommend restarting the server. Updating certificates is not a frequent operation, and introducing a callback during SSL handshake for this purpose may not be very worthwhile.
Imagine an object-based storage service with user-defined domains. The getCertificate feature allows adding or removing user certificates without restarting the server. Therefore, in multi-tenant applications with a lot of custom domains, this feature is useful.
NGINX supports this feature using njs and Lua.
As a radical idea, I plan to use FibJS instead of NGINX as a reverse proxy in a CDN in the future.
What about setSecureContext? It can also be used to set the cert/key without restarting the server.
Yes, but it works for only one SecureContext.
Suppose that I have 10,000 secure contexts, and each secureContext.cert (a single SSL certificate) may have additional host names (Subject Alternative Name) and wildcards.
let contexts: Array<SecureContext> = [];
for (let row of db.select('domain_certificates')) {
  contexts.push(createSecureContextFrom(row));
}
server.setSecureContext(contexts); // Wrong: setSecureContext does not accept an array.
setSecureContext accepts only a single secureContext.
I think that we need a separate function like getCertificate instead of setSecureContext, because:
- How do we pass multiple secure contexts to setSecureContext?
- What algorithm does setSecureContext use to find a match? Linear search or something better?
- How does setSecureContext support SAN and wildcards?
If we have a user-defined function getCertificate, it can handle the problems mentioned above.
Its input is a domain name and its output is the corresponding SecureContext.
So the point is that you need SNI support instead of changing the cert.
We need to consider a few issues:
- there are security concerns related to SNI technology itself, and Chrome browser has already provided an option to disable it. However, many services are still using SNI, so it may not be a problem for now.
- I still don't like switching to JavaScript to query certificates every handshake, as it may affect performance. This is something to consider in high-performance scenarios.
As a solution, I'm considering whether we can achieve this by extending setSecureContext, for example, by adding a domain template parameter to setSecureContext to modify the certificate of the specified domain.
This way, SSL handshake can be done without switching to the JS environment. If required by the business, we can modify the certificate of the specified domain template at any time.
- What algorithm is used to find a match by setSecureContext? Linear search or better algorithms?
Good point. We should consider the lookup algorithm when there are lots of certificates that need to be matched.
We can use an optimized template algorithm to solve the scaling problem. For example, if the domain name is not a template, we look it up in a map; if it is not found in the map, we fall back to template traversal.
By using maps, we can also optimize template queries by repeatedly removing the leading subdomain from the domain name and looking up the remainder in different maps.
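The lookup described above could be sketched roughly as follows. This is not FibJS code; CertificateIndex and its methods are hypothetical names, and the sketch assumes templates take the form "*.example.com". It lets a template match at any depth, as the comment describes; X.509 wildcards normally cover a single label, so a stricter version would stop after stripping the first label.

```javascript
// Two maps: exact host names, and wildcard templates keyed by their suffix
// (the template "*.example.com" is stored under "example.com").
// All names here are illustrative, not part of FibJS or Node.js.
class CertificateIndex {
  constructor() {
    this.exact = new Map();
    this.wildcard = new Map();
  }

  set(template, context) {
    const name = template.toLowerCase();
    if (name.startsWith('*.')) this.wildcard.set(name.slice(2), context);
    else this.exact.set(name, context);
  }

  lookup(hostName) {
    const name = hostName.toLowerCase();
    // Exact names are a single map probe.
    if (this.exact.has(name)) return this.exact.get(name);
    // Otherwise strip leading labels one at a time and probe the wildcard map.
    let dot = name.indexOf('.');
    while (dot !== -1) {
      const suffix = name.slice(dot + 1);
      if (this.wildcard.has(suffix)) return this.wildcard.get(suffix);
      dot = name.indexOf('.', dot + 1);
    }
    return undefined;
  }
}
```

Each lookup costs at most one map probe per label in the host name, regardless of how many certificates are registered.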
In SNICallback, I can change/add/remove SecureContexts seamlessly. Am I right?
If SNI is disabled in the web browsers, how will the server send the right certificate?
Also, I don't know how the new TLS is implemented without SNI in low-level code.
I still don't like switching to JavaScript to query certificates every handshake
Yes, you are right. But this overhead applies only to users who have configured SNICallback; for other users, the TLS handshake will be done without switching to the JS environment.
I think that SNICallback can be optimized by WASM in the future without making FibJS complex.
Can we have these two solutions (template and SNICallback) together?
In SNICallback, I can change/add/remove SecureContexts seamlessly. Am I right?
yes.
If SNI is disabled in the web browsers, how will the server send the right certificate?
The server just sends the default certificate. Cloudflare works that way.
I think that SNICallback can be optimized by WASM in the future without making FibJS complex.
Actually, the problem does not lie with JavaScript, but rather with the environmental switch.
Can we have these two solutions (template and SNICallback) together?
SNICallback is easier, I just don't like it.
I will think it over.
Actually, the problem does not lie with JavaScript, but rather with the environmental switch.
Is it possible to define a function for your solution in CPP that does not have the environmental switch cost, but I can override it in JS? Or am I thinking wrong?
Thank you for your time.
Plan A:
We can first look up the certificate for the domain in the cache. If it cannot be found, we call SNICallback to resolve it.
Plan B:
We can use SNICallback as a one-time function call for each domain.
After SNICallback is called with a specific domain, FibJS will save the certificate with the domain in the cache.
If a user wants to change the certificate for a specific domain, they can do so using setSecureContext.
=============
maybe plan A is better. It is easier to implement and the results of the code are easier to understand.
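To make plan A concrete, here is a minimal sketch of "cache first, callback on a miss". It is not FibJS code: createCachedResolver is a hypothetical name, and the resolver is assumed to be synchronous purely to keep the example short.

```javascript
// Plan A sketch: consult the cache first; call the JS resolver only on a miss.
// `createCachedResolver` and `resolve` are illustrative names.
function createCachedResolver(resolve) {
  const cache = new Map();
  return {
    get(hostName) {
      if (cache.has(hostName)) return cache.get(hostName);
      const context = resolve(hostName); // assumed synchronous
      if (context !== undefined) cache.set(hostName, context);
      return context;
    },
    // Explicitly replace the cached entry for one domain.
    set(hostName, context) {
      cache.set(hostName, context);
    },
  };
}
```

With this shape, the resolver runs once per domain until the entry is replaced, so repeated handshakes for the same name never leave the cache path.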
I think getSecureContext/secureContextResolver is a better name than SNICallback.
Plan A is good. But, how can users manage the cache?
Imagine that we have 10,000 SecureContexts.
- Most of them are rarely-used and should not be in the cache.
- Over time, some of them become useless and must be removed.
- The cache needs a limit to store only frequently-used SecureContexts.
In the long run, the cache will be polluted and should be cleared.
Users need an API for cache management to prevent FibJS from leaking memory for unused SecureContexts.
FibJS uses getSecureContext to implicitly add a new SecureContext to the cache. The following solutions can be used to change/remove a SecureContext:
Solution A
It is good if each SecureContext has a timeToCache. But a SecureContext may be shared between multiple domains, and FibJS must periodically remove expired SecureContexts from the cache.
Solution B
Make SecureContext.key and SecureContext.cert read-write instead of read-only, so users can implicitly change a SecureContext without manipulating the cache.
Users use clearSecureContexts/removeSecureContext(name) to explicitly remove unused SecureContexts from the cache.
Solution C
Is it possible for FibJS to use JSMap for caching and expose it to users?
What is the difference between plan A and plan B?
- In plan B, for each domain, SNICallback is called only one time and FibJS caches its result forever.
- Plan A is like plan B, but the cache will periodically be cleared.
- Or in plan A, FibJS will not cache the result of SNICallback.
- Or something else?
Cool, there are so many ideas. Let us sum up:
- We need a JS callback function, so that we can catch missing certificates and resolve them.
  - The function name may be getSecureContext/secureContextResolver; it is better than SNICallback.
  - The callback function does not need to be compatible with SNICallback; it can be a synchronous function, which will make it less likely to go wrong.
- We need to cache the certificates to avoid calling the JS function every time.
  - We can cache the certificates using the CN name and SAN list.
- We need to automatically clean the cache to avoid storing too many certificates in it.
  - We can use the LRU algorithm to manage the cache, where we can set update timeouts and cache sizes. Certificates that exceed the timeout will be cleared, and when the size is exceeded, the least recently used certificates will also be cleared.
- We need a manual cleaning method that can actively clear the cache.
  - It may be clearSecureContexts/removeSecureContext(name).
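The cache behavior summarized above (LRU eviction, a timeout, a size limit, and manual removal) could look roughly like the sketch below. It is not the FibJS implementation; SecureContextCache, its option names, and the injectable now clock are all assumptions made for illustration.

```javascript
// LRU cache with a size limit and a per-entry timeout.
// All names are illustrative; `now` is injectable so the clock can be faked.
class SecureContextCache {
  constructor({ maxSize = 1000, timeout = 60000, now = Date.now } = {}) {
    this.maxSize = maxSize;
    this.timeout = timeout; // milliseconds an entry stays valid after being set
    this.now = now;
    this.entries = new Map(); // Map keeps insertion order: first key = least recent
  }

  get(name) {
    const entry = this.entries.get(name);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.timeout) {
      this.entries.delete(name); // expired
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(name);
    this.entries.set(name, entry);
    return entry.context;
  }

  set(name, context) {
    this.entries.delete(name);
    this.entries.set(name, { context, storedAt: this.now() });
    if (this.entries.size > this.maxSize) {
      // Evict the least recently used entry.
      this.entries.delete(this.entries.keys().next().value);
    }
  }

  remove(name) {
    return this.entries.delete(name);
  }

  clear() {
    this.entries.clear();
  }
}
```

A plain Map is enough here because deleting and re-inserting a key moves it to the end of the iteration order, which gives LRU ordering without a separate linked list.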
we can cache the certificates using the CN name and SAN list.
Let's look at some examples of edge cases:
- An end-user may make a certificate using multiple domains, but want to use one of them in our service. so the cache will be polluted.
- An end-user may make multiple certificates over time and want to use the latest SAN list and some domains from the first SAN list.
- Also the cache lookup algorithm must check wildcards.
So I think it is better to cache each SecureContext under the same name that FibJS passes to getSecureContext, without using the CN name and SAN list.
Sure, makes sense.
I created another issue (#779) to prevent this issue from being off topic.
Can Torque prevent/reduce the environmental switch cost to use getSecureContext without caching?
It's almost done. It looks like this:
it('set/get', () => {
  var ctx = tls.createSecureContext(true);
  ctx.setSNIContext("test", sni_resolver("test"));
  ctx.setSNIContext("test1", sni_resolver("test1"));
  assert.equal(ctx.getSNIContext("test").cert.subject, 'CN=test');
  assert.equal(ctx.getSNIContext("test1").cert.subject, 'CN=test1');
});

it('resolver', () => {
  var ctx = tls.createSecureContext({
    "SNIResolver": sni_resolver
  }, true);
  assert.equal(ctx.getSNIContext("test", true).cert.subject, 'CN=test');
  assert.equal(ctx.getSNIContext("test1", true).cert.subject, 'CN=test1');
});

it('delete', () => {
  var ctx = tls.createSecureContext(true);
  ctx.setSNIContext("test", sni_resolver("test"));
  ctx.setSNIContext("test1", sni_resolver("test1"));
  ctx.removeSNIContext("test");
  assert.equal(ctx.getSNIContext("test"), undefined);
  assert.equal(ctx.getSNIContext("test1").cert.subject, 'CN=test1');
});

it('size', () => {
  var ctx = tls.createSecureContext({
    "SNICacheSize": 2
  }, true);
  ctx.setSNIContext("test", sni_resolver("test"));
  ctx.setSNIContext("test1", sni_resolver("test1"));
  ctx.setSNIContext("test2", sni_resolver("test2"));
  assert.equal(ctx.getSNIContext("test"), undefined);
  assert.equal(ctx.getSNIContext("test1").cert.subject, 'CN=test1');
  assert.equal(ctx.getSNIContext("test2").cert.subject, 'CN=test2');
});

it('timeout', () => {
  var ctx = tls.createSecureContext({
    "SNICacheTimeout": 200
  }, true);
  ctx.setSNIContext("test", sni_resolver("test"));
  coroutine.sleep(400);
  assert.equal(ctx.getSNIContext("test"), undefined);
});
done.
2d9024d