mozilla/tls-observatory

Be permissive in certificate decoding

jcjones opened this issue · 16 comments

Golang's crypto/x509 decoder is very strict.

  • This DPDHL TLS CA I3 intermediate CA fails to parse the dnsName constraint "leserservice-media.de " due to the trailing whitespace.

  • The Suva Root CA intermediate CA cert fails to parse its rfc822Name constraint "@suva.ch" due to the @ character.

(Note these certificates are also revoked and/or expiring, but that aside...)

crt.sh, openssl, and NSS all parse these certificates and return data. In the long run, it would be nice -- as a diagnostic tool -- for the Observatory to also have a permissive decoder, likely forked from crypto/x509, that could return as much data as possible in the x509.Certificate struct along with a parallel list of decoding warnings, rather than erroring out. This would be useful for diagnosing certificates on affected websites (instead of just bailing out), and also for the CCADB's parsing code.

This is a departure from the Golang core ethos, so it probably belongs in its own Go library - a permissive/x509 package or some such.

Should we just switch to the google/certificate-transparency-go forks? They have the same reasoning of needing more permissive parsing.

It looks like they handle updating from upstream Go every few months too.

https://github.com/google/certificate-transparency-go#repository-structure
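In principle the switch would mostly be a per-file import swap (a sketch; the actual set of files touched in tls-observatory may differ):

```diff
 import (
 	"fmt"

-	"crypto/x509"
+	"github.com/google/certificate-transparency-go/x509"
 )
```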

I like that idea.

Well, it's not going to be that easy.

package main

import (
	"fmt"
	"io/ioutil"

	"github.com/google/certificate-transparency-go/x509"
)

func main() {
	cases := []string{
		"12729537.der",
		"12728748.der",
	}

	for i := range cases {
		cert, err := parse(cases[i])
		if err != nil {
			fmt.Printf("%s: %v\n", cases[i], err)
			continue
		}
		if cert == nil {
			fmt.Printf("%s: nil cert\n", cases[i])
			continue // avoid a nil pointer dereference below
		}
		fmt.Println(cert.Subject.String())
	}
}

func parse(path string) (*x509.Certificate, error) {
	bs, err := ioutil.ReadFile(path)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(bs)
}
$ go run /tmp/parse.go 
12729537.der: x509: failed to parse dnsName constraint "leserservice-media.de "
12728748.der: x509: failed to parse rfc822Name constraint "@suva.ch"

12729537.der and 12728748.der are just openssl x509 converted blobs from the crt.sh PEM.

Oh cool, the zmap/zcrypto fork works though! https://github.com/zmap/zcrypto/tree/master/x509

$ go run /tmp/parse.go 
C=DE, ST=Nordrhein-Westfalen, L=Bonn, OU=IT Services, O=Deutsche Post DHL, CN=DPDHL TLS CA I3
C=CH, ST=Luzern, O=Suva, CN=Suva Root CA 1

Also, because we use golang.org/x/crypto/ocsp, I had to patch it to use the non-stdlib x509 package before the build would work again. Luckily it's just one file, so we should be able to fork it pretty easily.

This sounds like we need to try several parsers and store the errors from each in a new parsing_errors column of the certificate table. Do I sense a patch coming our way, @adamdecaf?

We could try multiple parsers. It looks like zmap/zcrypto doesn't attempt validation, which sounds ok here too?

The relevant source from zcrypto:

switch subtree.Value.Tag {
case 1:
	out.PermittedEmailAddresses = append(out.PermittedEmailAddresses, GeneralSubtreeString{Data: string(subtree.Value.Bytes), Max: subtree.Max, Min: subtree.Min})
case 2:
	out.PermittedDNSNames = append(out.PermittedDNSNames, GeneralSubtreeString{Data: string(subtree.Value.Bytes), Max: subtree.Max, Min: subtree.Min})

Do feel free to file bugs against Google's code, @adamdecaf - the intention is that code should accept any cert accepted as valid by any mainstream browser...

Oh. I see you have!

There's already a fix in the upstream CT code now. Should we still try both libraries @jvehent ?

I want to be careful about the approach we take here. It looks like the CT parser will continue and ignore the field it cannot parse, potentially returning an incomplete list of constraints, and that's a bad idea for our use case.

Does the zcrypto parser return the full list of constraints?

In case it's helpful for quick comparison, the parsed certificate output presented in Censys is directly from ZCrypto (and AFAICT does catch those). Links:

DPDHL TLS CA I3: https://censys.io/certificates/5ffdede82957b43d4676b1cfcc39ceb150dc63dbfc33e26d99caa9b9762a4564/raw

Suva Root CA 1: https://censys.io/certificates/832266d6ba8cbfcbf28e0614a01d9f4c39b8e41f7c87d2077dbb6c03840ca9c2/raw

As @adamdecaf mentions, we generally don't attempt to validate the correctness of attributes in ZCrypto, just purely parse out what's there. That said, we do have another sister library, ZLint, which takes an X.509 object from ZCrypto and checks the correctness of attributes against RFC 5280 and the baseline requirements.

It seems easy enough to switch parsing over to zcrypto, but I assume we'd want to run zlint (and stdlib crypto/x509) over the certificates to store errors.

I'm OK going with zcrypto, and I like the idea of storing zlint results in a new ::jsonb column of the certificate table. One remaining question is: would it impact/change the way we parse certconstraints in the certificate package today?

> One remaining question is: would it impact/change the way we parse certconstraints in the certificate package today?

I don't think so. The tls-observatory only worries about Permitted/Excluded DNSDomains and IPAddresses, both of which zcrypto has fields for and parses out. We'll just need to make the trivial import changes.

I see the tls-observatory needing to convert between zcrypto.Certificate and certificate.Certificate, which looks a bit annoying because we'll need to set up mappings between all the subtypes and fields on the two structs. (We can't defer to a .ParseCertificate call, as that would run into the original bug.) I'm working up a PoC for this.