caddyserver/certmagic

Question: how to use on demand with managed certificates

svedova opened this issue · 9 comments

What is your question?

Hi 👋

We have the following use case:

  • We need to manage a wildcard certificate: *.example.org
  • We need to manage unknown domains domain1.com, domain2.com etc...

How should we set up the server? We're using only certmagic and no Caddy Server.

What have you already tried?

We tried something similar, it works but I'm not sure if that's the right way to do:

config := &certmagic.Config{}
cache := certmagic.NewCache(certmagic.CacheOptions{
  GetConfigForCert: func(cert certmagic.Certificate) (*certmagic.Config, error) {
    return config, nil
  },
})

// Also set the issuer
// ...

server := certmagic.New(cache, *config)
server.ManageAsync(context.TODO(), []string{"*.example.org"})

// And then we specify the on demand config:

certmagic.Default.OnDemand = &certmagic.OnDemandConfig{
  DecisionFunc: func(name string) error {
    // compute if on demand domain certificate should be issued or not
    return nil
  },
}

certmagic.HTTPS([]string{}, handler)

To my understanding, the DecisionFunc should not be called when a domain like this is requested: subdomain.example.org because we manage that async and it should be called only for non-managed domains.

Are we on the right path?

Also, when we do something like this instead of managing async:

certmagic.Default.OnDemand = &certmagic.OnDemandConfig{
  DecisionFunc: func(name string) error {
    // compute if on demand domain certificate should be issued or not
    return nil
  },
}

certmagic.HTTPS([]string{"*.example.org"}, handler)

and then visit subdomain.example.org it creates a certificate for subdomain.example.org and does not use the wildcard certificate. Why is that like that?

mholt commented

Hi @svedova -- just catching up with things and seeing your questions now. It's almost midnight though so let me quickly answer and then try to revisit this again soon.

Hard to know for sure what is going on at my initial glance, but first thing I would do is set a Logger so that you can get more details, and post them here if possible.

Because yeah, CertMagic should prefer to use a matching cert if it has one, even if it's a wildcard.

A wildcard domain typically requires the DNS challenge to be enabled, as per CA requirements. So if the CA refuses to issue the wildcard cert, CertMagic won't have one, and it will instead try to get one on-demand for the specific subdomain. So make sure it's able to get a wildcard cert.

I'm facing this same issue at the moment.

It is possible to specify multiple ACMEIssuers. One with the DNS solver, and another with the TLS-ALPN/HTTP solver. However...

  1. If the TLS-ALPN solver is used first, it will obtain a certificate for sub.example.com.
  2. If the DNS solver is used first, there seems to be no way to force it to obtain a certificate for *.example.com instead of sub.example.com.

ManageSync and ManageAsync only add the wildcard domain to the hostWhitelist if OnDemand is not nil, but then, if DecisionFunc() is set, the hostWhitelist is never used.

My suggestion is to either:

  • Make OnDemand.DecisionFunc() return the host that a certificate should be obtained for, instead of simply a boolean; or
  • Always check the hostWhitelist before the DecisionFunc(). This way, calling ManageSync() or ManageAsync() will give the wildcard host precedence.

If that sounds good, I'll be happy to send a PR

I have the same question about managing both static wildcard and user-defined hostnames. What is the best strategy for such combination?

mholt commented

I think there may be 2 or 3 different questions in this thread.

My initial understanding of the question is how to combine a static, managed wildcard cert with unknown on-demand certs. The next question appears to be about mixing different solvers, and then the last question might be about dynamic hostnames with static wildcards.

There's probably a conflation of different features and nuances going on here.

Can someone provide minimal, runnable example code to illustrate the specific issue? That will help me understand things a little better. The example at the very top doesn't make much sense to me because server is never used to actually serve anything.

> Can someone provide minimal, runnable example code to illustrate the specific issue? That will help me understand things a little better. The example at the very top doesn't make much sense to me because server is never used to actually serve anything.

@mholt I'm a bit confused. What do you mean by server not being used to serve anything? It's used to ManageAsync the fixed domains.

Our use case is the following:

  • we have a top-level domain. Let's call it example.org.
  • we also have dynamic subdomains such as client-a.example.org and client-b.example.org.
  • finally, we also have dynamic top-level domains such as example-1.org and example-2.org.

All the websites are served through the same server and we're using CertMagic for that.

Now example.org and *.example.org can be served through two certificates: one for example.org and a wildcard certificate for the subdomains. To my understanding, we need to use the ManageAsync function for these, whereas for other top-level domains we need to rely on the DecisionFunc.

The question in the OP is simply asking for guidance for this use case.

I hope this is a bit more clear now.

mholt commented

@svedova Thanks for the explanation.

> I'm a bit confused. What do you mean by server not being used to serve anything? It's used to ManageAsync the fixed domains.

When you create a certmagic.Config (named server in the above code), you typically need to use the certificates by getting a TLSConfig() from it. Otherwise the certs just get maintained in storage but aren't used for anything.

The OnDemand func in the code above is being set on a totally different CertMagic instance with a different cache, so it obviously won't know about what the server instance is doing.

That's why I'm confused -- I'm trying to figure out which question to answer to give the most relevant help.


> Now example.org and *.example.org can be served through two certificates: one for example.org and a wildcard certificate for the subdomains. To my understanding, we need to use the ManageAsync function for these, whereas for other top-level domains we need to rely on the DecisionFunc.

That's accurate, as far as I read it.

However, the earlier statement in OP:

> the DecisionFunc should not be called when a domain like this is requested: subdomain.example.org because we manage that async and it should be called only for non-managed domains.

I am not sure that is true given the code, because there are 2 instances of CertMagic in use, and so the config with the DecisionFunc doesn't know that the cert for subdomain.example.org (or, specifically, the wildcard cert) exists.
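The effect of having two separate instances can be shown with a toy model in plain Go. Everything here is a simplification made up for illustration (the types certCache and instance, and the resolve method, are not CertMagic types): each instance only consults its own cache, so an on-demand instance that never loaded the wildcard has no way to prefer it.

```go
package main

import (
	"fmt"
	"strings"
)

// certCache is a toy stand-in for a certificate cache: the set of
// subject names for which a certificate is already loaded.
type certCache map[string]bool

// instance is a toy stand-in for one configured certificate manager.
type instance struct {
	cache    certCache
	onDemand bool // whether on-demand issuance is enabled
}

// resolve describes how this instance would answer a handshake for name:
// exact cached cert, then cached wildcard, then on-demand issuance.
func (in instance) resolve(name string) string {
	wildcard := ""
	if i := strings.Index(name, "."); i > 0 {
		wildcard = "*" + name[i:]
	}
	switch {
	case in.cache[name]:
		return "cached:" + name
	case wildcard != "" && in.cache[wildcard]:
		return "cached:" + wildcard
	case in.onDemand:
		return "on-demand:" + name
	default:
		return "no certificate"
	}
}

func main() {
	// Two instances with separate caches: one manages the wildcard,
	// the other one serves traffic with on-demand enabled.
	managing := instance{cache: certCache{"*.example.org": true}}
	serving := instance{cache: certCache{}, onDemand: true}
	fmt.Println(managing.resolve("subdomain.example.org")) // cached:*.example.org
	fmt.Println(serving.resolve("subdomain.example.org"))  // on-demand:subdomain.example.org

	// A single instance that both manages the wildcard and allows on-demand.
	combined := instance{cache: certCache{"*.example.org": true}, onDemand: true}
	fmt.Println(combined.resolve("subdomain.example.org")) // cached:*.example.org
	fmt.Println(combined.resolve("domain1.com"))           // on-demand:domain1.com
}
```

In this model the fix is simply to manage the wildcard and enable on-demand on the same instance, and to serve traffic from that instance.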

@mholt thanks for the quick answer. I'll revisit the code and read more about the TLSConfig implementation.

Somehow, though, the code above still works and certificates are generated as expected. We haven't had any issues since I posted this, so I guess from my side you can close this issue.

mholt commented

> Somehow, though, the code above still works and certificates are generated as expected.

Oh. That's good I guess. But what about this?

> Also, when we do something like this instead of managing async: ... and then visit subdomain.example.org it creates a certificate for subdomain.example.org and does not use the wildcard certificate. Why is that like that?

I think the answer to that question is what I was just saying, where the server isn't used in that configuration, so it tries to get a certificate on-demand using the ServerName from SNI, rather than the cert that is being managed by the separate server instance.

> We haven't had any issues since I posted this, so I guess from my side you can close this issue.

Well, cool! :)

I'll close this since there doesn't seem to be anything actionable, and I'm quite a bit confused as to what any issues or questions are. If there are further questions others have, maybe best to open a new issue at this point, with very specific code and question.