terraform-google-modules/terraform-google-vault

Potentially invalid listener configuration on LOCAL_IP

Closed this issue · 3 comments

TL;DR

I see that the Vault configuration defines a TLS listener on the instance's local IP address, configured with the certificate generated in the Terraform here. However, judging from the list of IP SANs, the certificate doesn't appear to be valid for the instance's local IP.

Expected behavior

Nothing has happened yet; I was reading the module source, noticed this, and didn't know where else to ask how it works. Is that listener even used for anything? If so, how does it work? I see the listener's port is set to vault_port, but the port in cluster_addr is hard-coded to 8201 here.
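
To make the question concrete, this is roughly the shape of configuration I'm asking about (illustrative only, not copied from the module's template; the addresses and paths are placeholders):

# Cluster port is hard-coded to 8201
cluster_addr = "https://LOCAL_IP:8201"

# TLS listener on the instance's local IP, using the module-generated
# certificate, on vault_port (8200 by default) rather than 8201
listener "tcp" {
  address       = "LOCAL_IP:8200"
  tls_cert_file = "/path/to/generated/vault.crt"
  tls_key_file  = "/path/to/generated/vault.key"
}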

Observed behavior

No response

Terraform Configuration

There's no Terraform configuration because this is a question.

Terraform Version

Terraform v1.5.7

Additional information

Sorry if I should have asked this question somewhere else.

I was just looking at the Vault documentation again to see if there was anything I missed. It seems like they do the same type of thing in the example configuration (w.r.t. the port):

ui            = true
# cluster_addr specifies port 8201 but there's no listener defined on port 8201
cluster_addr  = "https://127.0.0.1:8201"
api_addr      = "https://127.0.0.1:8200"
disable_mlock = true

storage "raft" {
  path = "/path/to/raft/data"
  node_id = "raft_node_id"
}

listener "tcp" {
  address       = "127.0.0.1:8200"
  tls_cert_file = "/path/to/full-chain.pem"
  tls_key_file  = "/path/to/private-key.pem"
}

telemetry {
  statsite_address = "127.0.0.1:8125"
  disable_hostname = true
}
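
Reading the tcp listener docs more closely, it looks like each tcp listener also binds a separate cluster port that defaults to one port higher than its address, which would explain why this example can point cluster_addr at 8201 without declaring an explicit 8201 listener. If I understand it right, the listener above is roughly equivalent to spelling the default out like this (my own illustration, not taken from the docs):

listener "tcp" {
  address         = "127.0.0.1:8200"
  # cluster_address defaults to the listener address with the port incremented
  # by one, so this just makes the implicit 127.0.0.1:8201 cluster listener explicit
  cluster_address = "127.0.0.1:8201"
  tls_cert_file   = "/path/to/full-chain.pem"
  tls_key_file    = "/path/to/private-key.pem"
}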

Another update with some results from experimentation:

I deployed two Vault instances with the following identical configuration:

api_addr     = "https://vault.my.domain:8200"

# Enable the UI
ui = true

# Enable plugin directory
plugin_directory = "/etc/vault.d/plugins"

# Enable auto-unsealing with Google Cloud KMS
seal "gcpckms" {
  project    = "my-gcp-project"
  region     = "us-central1"
  key_ring   = "vault-keyring-Zc"
  crypto_key = "seal-gcpckms"
}

# Enable HA backend storage with GCS
storage "gcs" {
  bucket     = "my-vault-storage"
  ha_enabled = "true"
}

# Create local non-TLS listener
listener "tcp" {
  address     = "127.0.0.1:8201"
  tls_disable = 1
}

# Create a TLS listener on the load balancer address
listener "tcp" {
  address            = "0.0.0.0:8200"
  tls_cert_file      = "/etc/vault.d/tls/vault.crt"
  tls_key_file       = "/etc/vault.d/tls/vault.key"

  tls_disable_client_certs = "true"
}

Vault started up fine with this configuration.

I then SSH'ed into both instances. I ran vault operator init on one (call it instance A) while tailing the Vault logs on the other (call it instance B) and I saw these messages:

core: vault is unsealed
core: unsealed with stored key
core: entering standby mode
core: acquired lock, enabling active operation
core: post-unseal setup starting
core: loaded wrapping token key
core: successfully setup plugin runtime catalog
core: successfully setup plugin catalog: plugin-directory=/etc/vault.d/plugins
core: successfully mounted: type=system version="v1.17.0+builtin.vault" path=sys/ namespace="ID: root. Path: "
core: successfully mounted: type=identity version="v1.17.0+builtin.vault" path=identity/ namespace="ID: root. Path: "
core: successfully mounted: type=cubbyhole version="v1.17.0+builtin.vault" path=cubbyhole/ namespace="ID: root. Path: "
core: successfully mounted: type=token version="v1.17.0+builtin.vault" path=token/ namespace="ID: root. Path: "
rollback: Starting the rollback manager with 256 workers
core: restoring leases
rollback: starting rollback manager
expiration: lease restore complete
identity: entities restored
identity: groups restored
core: usage gauge collection is disabled
core: post-unseal setup complete

I then enabled the kv secrets engine on instance B using the root token I got from instance A after vault operator init:

vault secrets enable -version=1 kv
Success! Enabled the kv secrets engine at: kv/

vault secrets list
Path          Type         Accessor              Description
----          ----         --------              -----------
cubbyhole/    cubbyhole    cubbyhole_e14ce882    per-token private secret storage
identity/     identity     identity_8c153d34     identity store
kv/           kv           kv_6abef802           n/a
sys/          system       system_ce75d0de       system endpoints used for control, policy and debugging

However, when I try running vault secrets list from instance A, it fails. When I run the curl command from vault secrets list -output-curl-string, it gets stuck in a redirect loop. That makes sense, since I saw cluster_addr being set to the same value as api_addr.

After some more digging, I finally found these docs, played around with the configuration a bit more, and figured out some more quirks.

It seems that even though the config specifies cluster_addr = "https://LOCAL_IP:8201", a listener on "LOCAL_IP:${vault_port}" is also required. And even though that listener is given TLS configuration, those TLS settings don't actually get used for cluster traffic, according to this answer on the HashiCorp forums. Still, in my testing, if I removed the listener on LOCAL_IP:8200 the HA configuration stopped working.
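
To summarize, here's a minimal sketch of what I understand the HA-relevant pieces need to look like (the addresses and file paths are placeholders, not copied from the module):

# Per-node addresses; LOCAL_IP stands for the instance's private IP
api_addr     = "https://vault.my.domain:8200"
cluster_addr = "https://LOCAL_IP:8201"

# This listener appears to be what makes Vault bind the cluster port: its
# cluster_address defaults to the same host with the port incremented by one
# (LOCAL_IP:8201), which matches cluster_addr above. Per the forum answer, the
# cert/key below aren't used for cluster traffic, but removing the listener
# broke HA in my tests.
listener "tcp" {
  address       = "LOCAL_IP:8200"
  tls_cert_file = "/etc/vault.d/tls/vault.crt"
  tls_key_file  = "/etc/vault.d/tls/vault.key"
}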

I'm just going to close this issue: even though the configuration is confusing, it seems to be necessary for the module to deploy Vault properly.