onejli/docker-vpn-helper

Ensure parity with certs generated by docker provisioning

batsatt opened this issue · 6 comments

This worked for me with 1.9.0d. After upgrading, all seems fine with docker, and docker-machine ssh works fine, but some docker-machine commands fail with what looks like a cert issue:

$ docker-machine version dev
Unable to query docker version: Get https://192.168.99.100:2376/v1.15/version: remote error: handshake failure

$ docker-machine env dev
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "192.168.99.100:2376": remote error: handshake failure
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which will stop running containers.

I tried modifying the generated cert to include DNS:localhost.

I tried upgrading docker-machine to 0.5.6.

Any thoughts?

Here are the original and generated server.pem files.
Archive.zip

Forgot to mention that I also tried clearing the VBox host-only networks.

I found the issue with the cert; more extensions are required, e.g.

# SANs: localhost, the machine IP, and the loopback address
echo "subjectAltName = DNS:localhost, IP:${machineIp}, IP:127.0.0.1" >> "${extFile}"
# Add the usage extensions only when the client cert carries Extended Key Usage
if [[ ${clientCert} == *"Extended Key Usage"* ]]; then
    echo "keyUsage = critical, digitalSignature, keyEncipherment, keyAgreement" >> "${extFile}"
    echo "extendedKeyUsage = clientAuth, serverAuth" >> "${extFile}"
    echo "basicConstraints = critical, CA:FALSE" >> "${extFile}"
fi

I was testing this without going on the (Cisco) VPN at all. Unfortunately, once I did connect, I started seeing timeouts for docker-machine ls and env (I assume any docker-machine command that uses https would fail the same way).

I'm running Docker Toolbox 1.10.0 and haven't noticed any handshake failures either on or off VPN. I only see (the "expected") timeouts for docker-machine commands that try to connect over https using the host-only adapter (e.g., ls and env) while on VPN. Can you please try upgrading and let me know if the handshake problem still exists?

That said, I really should ensure that the attributes of the cert generated by this script match those of the cert generated by docker-machine.

Adding some notes for when I get to this:

  1. provisioning step that generates the cert
  2. key usage and extended key usage fields
@@ -6,9 +6,9 @@
         Signature Algorithm: sha256WithRSAEncryption
         Issuer: O=jonathan.li
         Validity
-            Not Before: Feb  8 06:30:00 2016 GMT
-            Not After : Jan 23 06:30:00 2019 GMT
-        Subject: O=jonathan.li.default
+            Not Before: Feb  8 06:35:11 2016 GMT
+            Not After : Feb  7 06:35:11 2017 GMT
+        Subject: CN=localhost
         Subject Public Key Info:
             Public Key Algorithm: rsaEncryption
             RSA Public Key: (2048 bit)
@@ -16,13 +16,7 @@
                 Exponent: 65537 (0x10001)
         X509v3 extensions:
-            X509v3 Key Usage: critical
-                Digital Signature, Key Encipherment, Key Agreement
-            X509v3 Extended Key Usage: 
-                TLS Web Client Authentication, TLS Web Server Authentication
-            X509v3 Basic Constraints: critical
-                CA:FALSE
             X509v3 Subject Alternative Name: 
-                DNS:localhost, IP Address:192.168.99.100
+                IP Address:192.168.99.100, IP Address:127.0.0.1
     Signature Algorithm: sha256WithRSAEncryption

TODO Ensure parity for:

  • expiration date
  • organization
  • key usage
  • extended key usage
  • constraints
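
A rough sketch of how the parity check in the TODO list could be automated. The helper names (machineTemplate, scriptTemplate, selfSign, parityDiff) are made up for illustration and are not part of this script or of docker-machine. To stay self-contained, it builds two throwaway self-signed ECDSA certs whose fields copy the two sides of the diff above; the real certs are RSA and signed by the machine CA.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"reflect"
	"time"
)

// machineTemplate copies the "-" side of the diff (cert from docker-machine).
func machineTemplate() *x509.Certificate {
	start := time.Date(2016, time.February, 8, 6, 30, 0, 0, time.UTC)
	return &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{Organization: []string{"jonathan.li.default"}},
		NotBefore:             start,
		NotAfter:              start.AddDate(0, 0, 1080), // Jan 23 2019
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment | x509.KeyUsageKeyAgreement,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth, x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true, // emits CA:FALSE because IsCA is unset
	}
}

// scriptTemplate copies the "+" side of the diff (cert from this script).
func scriptTemplate() *x509.Certificate {
	start := time.Date(2016, time.February, 8, 6, 35, 11, 0, time.UTC)
	return &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: "localhost"},
		NotBefore:    start,
		NotAfter:     start.AddDate(0, 0, 365), // Feb 7 2017
	}
}

// selfSign turns a template into a parsed certificate using a throwaway key.
func selfSign(tmpl *x509.Certificate) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// parityDiff names the TODO attributes on which two certificates differ.
func parityDiff(a, b *x509.Certificate) []string {
	var diffs []string
	if a.NotAfter.Sub(a.NotBefore) != b.NotAfter.Sub(b.NotBefore) {
		diffs = append(diffs, "expiration date")
	}
	if !reflect.DeepEqual(a.Subject.Organization, b.Subject.Organization) {
		diffs = append(diffs, "organization")
	}
	if a.KeyUsage != b.KeyUsage {
		diffs = append(diffs, "key usage")
	}
	if !reflect.DeepEqual(a.ExtKeyUsage, b.ExtKeyUsage) {
		diffs = append(diffs, "extended key usage")
	}
	if a.BasicConstraintsValid != b.BasicConstraintsValid || a.IsCA != b.IsCA {
		diffs = append(diffs, "constraints")
	}
	return diffs
}

func main() {
	machine, err := selfSign(machineTemplate())
	if err != nil {
		panic(err)
	}
	script, err := selfSign(scriptTemplate())
	if err != nil {
		panic(err)
	}
	fmt.Println("differs on:", parityDiff(machine, script))
}
```

With the two templates above, all five TODO items should show up as differences. Against real files, the same parityDiff could be fed certs loaded with encoding/pem and x509.ParseCertificate instead of the generated ones.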

Edit:
I didn't notice it earlier, but it appears that localhost was added to the SAN field. Yay! I can remove this cert hack for Docker Toolbox 1.10.0 and newer. For reference, here's the cert as generated by docker-machine 0.4.1, without localhost or a loopback address.

Just tried Toolbox 1.10.0 and was able to remove the cert generation logic from my variant of your script. Progress!

For the docker-machine ls, env, etc. issues, I hacked virtualbox.go to hard-code localhost:

func (d *Driver) GetIP() (string, error) {
    // DHCP is used to get the IP, so virtualbox hosts don't have IPs unless
    // they are running
    s, err := d.GetState()
    if err != nil {
        return "", err
    }
    if s != state.Running {
        return "", drivers.ErrHostIsNotRunning
    }

    return "localhost", nil // TODO batsatt hack
}

Previously, I had this return 127.0.0.1, but switching to localhost was required to match the default SAN, which lists DNS:localhost but no loopback IP.

I just verified that this works with the latest machine source.