hashicorp/go-discover

Provider K8s does not support hostname for pods - breaks TLS setups

innovia opened this issue · 5 comments

Hi

I've tried to use the new auto-join feature in Vault 1.6.1, but it fails because my Vault is running with TLS.

My CSR looks like this:

Name:               vault-csr
Signer:             kubernetes.io/legacy-unknown
Status:             Approved,Issued
Subject:
  Common Name:    vault.vault.svc
  Serial Number:
Subject Alternative Names:
         DNS Names:     vault
                        vault.vault
                        vault.vault.svc
                        vault.vault.svc.cluster.local
                        vault-0.vault-internal
                        vault-1.vault-internal
                        vault-2.vault-internal
                        localhost
         IP Addresses:  127.0.0.1

The problem with the k8s provider is that it returns the Pod IP address, which is not in the certificate's SANs, so TLS verification of the discovered peer fails:

addr := pod.Status.PodIP
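To illustrate why a discovered Pod IP fails verification, here is a minimal sketch that checks a few candidate hosts against the SANs from the CSR above, using Go's standard `crypto/x509` hostname matching. The certificate is a bare struct filled in by hand rather than a parsed certificate, and `10.42.0.17` is a made-up Pod IP used only for illustration.

```go
package main

import (
	"crypto/x509"
	"fmt"
	"net"
)

// csrCert mirrors the SANs listed in the CSR above. It is a hand-built
// struct for illustration only, not a real signed certificate.
func csrCert() *x509.Certificate {
	return &x509.Certificate{
		DNSNames: []string{
			"vault",
			"vault.vault",
			"vault.vault.svc",
			"vault.vault.svc.cluster.local",
			"vault-0.vault-internal",
			"vault-1.vault-internal",
			"vault-2.vault-internal",
			"localhost",
		},
		IPAddresses: []net.IP{net.ParseIP("127.0.0.1")},
	}
}

// sanMatches reports whether host (a DNS name or an IP literal) is covered
// by the certificate's SANs. IP literals are only compared with IP SANs.
func sanMatches(cert *x509.Certificate, host string) bool {
	return cert.VerifyHostname(host) == nil
}

func main() {
	cert := csrCert()
	// 10.42.0.17 is a hypothetical Pod IP, as returned by pod.Status.PodIP.
	for _, host := range []string{"10.42.0.17", "vault-0.vault-internal", "127.0.0.1"} {
		fmt.Printf("%-24s matches SAN: %v\n", host, sanMatches(cert, host))
	}
}
```

Only the Pod IP is rejected here, because the sole IP SAN is 127.0.0.1 and Pod IPs are not known when the certificate is issued.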

My raft config is:

    storage "raft" {
      path = "/vault/data"
      retry_join {
        auto_join = "provider=k8s label_selector=\"app.kubernetes.io/name=vault,component=server\" namespace=\"vault\""
        auto_join_scheme = "https"
        leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault-ca.pem"
        leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
      }
    }

If the provider could return the pod name combined with the service name, like vault-0.vault-internal, that would work, because those names are in the SANs.

Let me know if there is a way to use auto-join with TLS for k8s.
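As a rough sketch of what is being asked for, the address could be built from the pod's hostname and subdomain instead of its IP. This is not go-discover's code: `podInfo`, `joinHost`, and `leaderAPIAddr` are hypothetical names, and the sketch assumes the pods belong to a StatefulSet behind a headless service, where hostname and subdomain are normally set.

```go
package main

import (
	"fmt"
	"net"
)

// podInfo is a hypothetical, trimmed-down stand-in for the Kubernetes Pod
// fields that matter here; it is not the client-go type.
type podInfo struct {
	Hostname  string // spec.hostname, e.g. "vault-0"
	Subdomain string // spec.subdomain, e.g. "vault-internal" (headless service)
	PodIP     string // status.podIP
}

// joinHost prefers the stable "<hostname>.<subdomain>" name and falls back
// to the Pod IP when either field is missing.
func joinHost(p podInfo) string {
	if p.Hostname != "" && p.Subdomain != "" {
		return p.Hostname + "." + p.Subdomain
	}
	return p.PodIP
}

// leaderAPIAddr renders the address in the form used by leader_api_addr.
func leaderAPIAddr(scheme string, p podInfo, port string) string {
	return scheme + "://" + net.JoinHostPort(joinHost(p), port)
}

func main() {
	// The Pod IP below is made up for illustration.
	p := podInfo{Hostname: "vault-0", Subdomain: "vault-internal", PodIP: "10.42.0.17"}
	fmt.Println(leaderAPIAddr("https", p, "8200")) // https://vault-0.vault-internal:8200
}
```

The fallback to the Pod IP keeps the behaviour unchanged for pods that are not part of a StatefulSet.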

anyone?

I'm facing the same problem.

auto_join = "provider=k8s ..." discovers the IPs of the pods, but the certificate SANs do not contain those IPs. (We cannot add pod IPs to the SANs, because they change whenever a pod gets rescheduled.)

I think the code needs to be changed so that auto_join = "provider=k8s ..." discovers Vault pods in a StatefulSet by name, e.g. vault-0.vault-internal. Otherwise the combination of auto_join = "provider=k8s ..." and TLS is useless.

Failing that, we will need to use a static config, something like:

          retry_join {
            leader_api_addr = "https://vault-0.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            auto_join_scheme = "https"
          }
          retry_join {
            leader_api_addr = "https://vault-1.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            auto_join_scheme = "https"
          }
          retry_join {
            leader_api_addr = "https://vault-2.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
            leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
            leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
            auto_join_scheme = "https"
          }

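This is fixed, see hashicorp/vault#10698.

Use the leader_tls_servername option and set it to a name that is in the certificate's SANs, for example leader_tls_servername = "vault". The discovered Pod IP is still used to connect, but the certificate is verified against that name instead of the IP.
Example: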

retry_join {
  auto_join = "provider=k8s label_selector=\"app.kubernetes.io/name=vault,component=server\" namespace=\"{{ .Release.Namespace }}\""
  leader_tls_servername = "vault"
  leader_ca_cert_file = "/vault/userconfig/vault-server-tls/vault.ca"
  leader_client_key_file = "/vault/userconfig/vault-server-tls/vault.key"
  leader_client_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
  auto_join_scheme = "https"
}

Thanks.

My config:

    storage "raft" {
      path = "/vault/data"
      node_id = "HOSTNAME"
      retry_join {
        auto_join = "provider=k8s label_selector=\"app=hashicorp-vault,component=server\" namespace=\"VAULT_K8S_NAMESPACE\" "
        leader_tls_servername = "hashicorp-vault"
        auto_join_scheme = "https"
        leader_ca_cert_file = "/vault/cert/vault.ca"
        leader_client_key_file = "/vault/cert/vault.key"
        leader_client_cert_file = "/vault/cert/vault.crt"
      }
    }

Result:

/ $ vault operator raft list-peers
Node                 Address                                            State       Voter
----                 -------                                            -----       -----
hashicorp-vault-0    hashicorp-vault-0.hashicorp-vault-internal:8201    leader      true
hashicorp-vault-1    hashicorp-vault-1.hashicorp-vault-internal:8201    follower    true
hashicorp-vault-2    hashicorp-vault-2.hashicorp-vault-internal:8201    follower    true

Thanks for the fix. I was stuck on this issue myself.