/terraform-google-vault

Modular deployment of Vault on Google Compute Engine with Terraform

Primary LanguageHCLApache License 2.0Apache-2.0

Vault on GCE Terraform Module

Modular deployment of Vault on Google Compute Engine.

This module is versioned and released on the Terraform module registry. Look for the tag that corresponds to your version for the correct documentation.

  • Vault HA - Vault is configured to run in high availability mode with Google Cloud Storage. Choose a vault_min_num_servers greater than 0 to enable HA mode.

  • Production hardened - Vault is deployed according to applicable parts of the production hardening guide.

    • Traffic is encrypted with end-to-end TLS using self-signed certificates which can be generated or supplied (see Managing TLS below).

    • Vault is the main process on the VMs, and Vault runs as an unprivileged user (vault:vault) on the VMs under systemd.

    • Outgoing Vault traffic happens through a restricted NAT gateway through dedicated IPs for logging and monitoring. You can further restrict outbound access with additional firewall rules.

    • The Vault nodes are not publicly accessible. They do have SSH enabled, but require a bastion host on their dedicated network to access. You can disable SSH access entirely by setting ssh_allowed_cidrs to the empty list.

    • Swap is disabled (the default on all GCE VMs), reducing the risk that in-memory data will be paged to disk.

    • Core dumps are disabled.

    The following values do not represent Vault's best practices and you may wish to change their defaults:

    • Vault is publicly accessible from any IP through the load balancer. To limit the list of source IPs that can communicate with Vault nodes, set vault_allowed_cidrs to a list of CIDR blocks.

    • Auditing is not enabled by default, because an initial bootstrap requires you to initialize the Vault. Everything is pre-configured for when you're ready to enable audit logging, but it cannot be enabled before Vault is initialized.

    • Auto-unseal - Vault is automatically unsealed using the built-in Vault 1.0+ auto-unsealing mechanisms for Google Cloud KMS. The Vault servers are not automatically initialized, providing a clear separation.

    • Isolation - The Vault nodes are not exposed publicly. They live in a private subnet with a dedicated NAT gateway.

    • Audit logging - The system is setup to accept Vault audit logs with a single configuration command. Vault audit logs are not enabled by default because you have to initialize the system first.

Usage

  1. Add the module definition to your Terraform configurations:

    module "vault" {
      source         = "terraform-google-modules/vault/google"
      project_id     = var.project_id
      region         = var.region
      kms_keyring    = var.kms_keyring
      kms_crypto_key = var.kms_crypto_key
    }

    Make sure you are using version pinning to avoid unexpected changes when the module is updated.

  2. Execute Terraform:

    $ terraform apply
    
  3. Configure your local Vault binary to communicate with the Vault server:

    $ export VAULT_ADDR="$(terraform output vault_addr)"
    $ export VAULT_CACERT="$(pwd)/ca.crt"
    
  4. Wait for Vault to start. Here's a script or you can wait ~2 minutes.

    (while [[ $count -lt 60 && "$(vault status 2>&1)" =~ "connection refused" ]]; do ((count=count+1)) ; echo "$(date) $count: Waiting for Vault to start..." ; sleep 2; done && [[ $count -lt 60 ]])
    [[ $? -ne 0 ]] && echo "ERROR: Error waiting for Vault to start" && exit 1
    
  5. Initialize the Vault cluster, generating the initial root token and unseal keys:

    $ vault operator init \
        -recovery-shares 5 \
        -recovery-threshold 3
    

    The Vault servers will automatically unseal using the Google Cloud KMS key created earlier. The recovery shares are to be given to operators to unseal the Vault nodes in case Cloud KMS is unavailable in a disaster recovery. They can also be used to generate a new root token. Distribute these keys to trusted people on your team (like people who will be on-call and responsible for maintaining Vault).

    The output will look like this:

    Recovery Key 1: 2EWrT/YVlYE54EwvKaH3JzOGmq8AVJJkVFQDni8MYC+T
    Recovery Key 2: 6WCNGKN+dU43APJuGEVvIG6bAHA6tsth5ZR8/bJWi60/
    Recovery Key 3: XC1vSb/GfH35zTK4UkAR7okJWaRjnGrP75aQX0xByKfV
    Recovery Key 4: ZSvu2hWWmd4ECEIHj/FShxxCw7Wd2KbkLRsDm30f2tu3
    Recovery Key 5: T4VBvwRv0pkQLeTC/98JJ+Rj/Zn75bLfmAaFLDQihL9Y
    
    Initial Root Token: s.kn11NdBhLig2VJ0botgrwq9u
    

    Save this initial root token and do not clear your history. You will need this token to continue the tutorial.

Managing TLS

If, like many orgs, you manage your own self-signed TLS certificates, you likely will not want them managed by Terraform. Additionally this poses a security risk since the private keys will be stored in plaintext in the terraform.tfstate file. To use your own certificates, set manage_tls = false. Then before you apply this module, you'll need to have your certificates prepared. An example instantiation would look like this:

module "vault" {
  source  = "terraform-google-modules/vault/google"
  ...

  # Manage our own TLS Certs so the private keys don't
  # end up in Terraform state
  manage_tls        = false
  vault_tls_bucket  = google_storage_bucket.vault_tls.name
  vault_tls_kms_key = google_kms_crypto_key.vault_tls.self_link

  # These are the default values shown here for clarity
  vault_ca_cert_filename  = "ca.crt"
  vault_tls_cert_filename = "vault.crt"
  vault_tls_key_filename  = "vault.key.enc"
}

To store your keys, you’ll need to create at minimum the 3 files that are shown above at the root of the TLS bucket specified by vault_tls_bucket, but their filenames and paths can be overridden using the vault_*_filename variables shown above.

  • CA Certificate. This file should be the PEM formatted CA certificate that the Vault server certificate is created from
  • Vault Server Certificate. This file should correspond to the Vault Private Key also stored on the Vault hosts to terminate the TLS connection.
  • Vault Private Key. As you’ll notice, this key has the .enc file extension denoting it is encrypted. When the Vault host spins up, it will fetch all these certificates and the key and on the key it will use the specified TLS KMS Key to Base64 decode and then decrypt the private key before storing it on the filesystem.

Assuming you have these files locally that have been generated by OpenSSL or some other CA, you can store them with the following commands:

gcloud kms encrypt \
  --project=${PROJECT} \
  --key=${KMS_KEY} \
  --plaintext-file=vault.key \
  --ciphertext-file=- | base64 > "vault.key.enc"

for file in vault.key.enc ca.crt vault.crt; do
  gsutil cp $file gs://$TLS_BUCKET/$file
done

Inputs

Name Description Type Default Required
allow_public_egress Whether to create a NAT for external egress. If false, you must also specify an http_proxy to download required executables including Vault, Fluentd and Stackdriver bool true no
allow_ssh Allow external access to ssh port 22 on the Vault VMs. It is a best practice to set this to false, however it is true by default for the sake of backwards compatibility. bool true no
domain The domain name that will be set in the api_addr. Load Balancer IP used by default string "" no
http_proxy HTTP proxy for downloading agents and vault executable on startup. Only necessary if allow_public_egress is false. This is only used on the first startup of the Vault cluster and will NOT set the global HTTP_PROXY environment variable. i.e. If you configure Vault to manage credentials for other services, default HTTP routes will be taken. string "" no
kms_crypto_key The name of the Cloud KMS Key used for encrypting initial TLS certificates and for configuring Vault auto-unseal. Terraform will create this key. string "vault-init" no
kms_keyring Name of the Cloud KMS KeyRing for asset encryption. Terraform will create this keyring. string "vault" no
kms_protection_level The protection level to use for the KMS crypto key. string "software" no
load_balancing_scheme Options are INTERNAL or EXTERNAL. If EXTERNAL, the forwarding rule will be of type EXTERNAL and a public IP will be created. If INTERNAL the type will be INTERNAL and a random RFC 1918 private IP will be assigned string "EXTERNAL" no
manage_tls Set to false if you'd like to manage and upload your own TLS files. See Managing TLS for more details bool true no
network The self link of the VPC network for Vault. By default, one will be created for you. string "" no
network_subnet_cidr_range CIDR block range for the subnet. string "10.127.0.0/20" no
project_id ID of the project in which to create resources and add IAM bindings. string n/a yes
project_services List of services to enable on the project where Vault will run. These services are required in order for this Vault setup to function. list(string)
[
"cloudkms.googleapis.com",
"cloudresourcemanager.googleapis.com",
"compute.googleapis.com",
"iam.googleapis.com",
"logging.googleapis.com",
"monitoring.googleapis.com"
]
no
region Region in which to create resources. string "us-east4" no
service_account_name Name of the Vault service account. string "vault-admin" no
service_account_project_additional_iam_roles List of custom IAM roles to add to the project. list(string) [] no
service_account_project_iam_roles List of IAM roles for the Vault admin service account to function. If you need to add additional roles, update service_account_project_additional_iam_roles instead. list(string)
[
"roles/logging.logWriter",
"roles/monitoring.metricWriter",
"roles/monitoring.viewer"
]
no
service_account_storage_bucket_iam_roles List of IAM roles for the Vault admin service account to have on the storage bucket. list(string)
[
"roles/storage.legacyBucketReader",
"roles/storage.objectAdmin"
]
no
service_label The service label to set on the internal load balancer. If not empty, this enables internal DNS for internal load balancers. By default, the service label is disabled. This has no effect on external load balancers. string null no
ssh_allowed_cidrs List of CIDR blocks to allow access to SSH into nodes. list(string)
[
"0.0.0.0/0"
]
no
storage_bucket_class Type of data storage to use. If you change this value, you will also need to choose a storage_bucket_location which matches this parameter type string "MULTI_REGIONAL" no
storage_bucket_enable_versioning Set to true to enable object versioning in the GCS bucket.. You may want to define lifecycle rules if you want a finite number of old versions. string false no
storage_bucket_force_destroy Set to true to force deletion of backend bucket on terraform destroy string false no
storage_bucket_lifecycle_rules Vault storage lifecycle rules
list(object({
action = map(object({
type = string,
storage_class = string
})),
condition = map(object({
age = number,
created_before = string,
with_state = string,
is_live = string,
matches_storage_class = string,
num_newer_versions = number
}))
}))
[] no
storage_bucket_location Location for the Google Cloud Storage bucket in which Vault data will be stored. string "us" no
storage_bucket_name Name of the Google Cloud Storage bucket for the Vault backend storage. This must be globally unique across of of GCP. If left as the empty string, this will default to: '-vault-data'. string "" no
subnet The self link of the VPC subnetwork for Vault. By default, one will be created for you. string "" no
tls_ca_subject The subject block for the root CA certificate.
object({
common_name = string,
organization = string,
organizational_unit = string,
street_address = list(string),
locality = string,
province = string,
country = string,
postal_code = string,
})
{
"common_name": "Example Inc. Root",
"country": "US",
"locality": "The Intranet",
"organization": "Example, Inc",
"organizational_unit": "Department of Certificate Authority",
"postal_code": "95559-1227",
"province": "CA",
"street_address": [
"123 Example Street"
]
}
no
tls_cn The TLS Common Name for the TLS certificates string "vault.example.net" no
tls_dns_names List of DNS names added to the Vault server self-signed certificate list(string)
[
"vault.example.net"
]
no
tls_ips List of IP addresses added to the Vault server self-signed certificate list(string)
[
"127.0.0.1"
]
no
tls_ou The TLS Organizational Unit for the TLS certificate string "IT Security Operations" no
tls_save_ca_to_disk Save the CA public certificate on the local filesystem. The CA is always stored in GCS, but this option also saves it to the filesystem. bool true no
user_startup_script Additional user-provided code injected after Vault is setup string "" no
vault_allowed_cidrs List of CIDR blocks to allow access to the Vault nodes. Since the load balancer is a pass-through load balancer, this must also include all IPs from which you will access Vault. The default is unrestricted (any IP address can access Vault). It is recommended that you reduce this to a smaller list. list(string)
[
"0.0.0.0/0"
]
no
vault_args Additional command line arguments passed to Vault server string "" no
vault_ca_cert_filename GCS object path within the vault_tls_bucket. This is the root CA certificate. string "ca.crt" no
vault_instance_base_image Base operating system image in which to install Vault. This must be a Debian-based system at the moment due to how the metadata startup script runs. string "debian-cloud/debian-10" no
vault_instance_labels Labels to apply to the Vault instances. map(string) {} no
vault_instance_metadata Additional metadata to add to the Vault instances. map(string) {} no
vault_instance_tags Additional tags to apply to the instances. Note 'allow-ssh' and 'allow-vault' will be present on all instances. list(string) [] no
vault_log_level Log level to run Vault in. See the Vault documentation for valid values. string "warn" no
vault_machine_type Machine type to use for Vault instances. string "e2-standard-2" no
vault_max_num_servers Maximum number of Vault server nodes to run at one time. The group will not autoscale beyond this number. string "7" no
vault_min_num_servers Minimum number of Vault server nodes in the autoscaling group. The group will not have less than this number of nodes. string "1" no
vault_port Numeric port on which to run and expose Vault. string "8200" no
vault_proxy_port Port to expose Vault's health status endpoint on over HTTP on /. This is required for the health checks to verify Vault's status is using an external load balancer. Only the health status endpoint is exposed, and it is only accessible from Google's load balancer addresses. string "58200" no
vault_tls_bucket GCS Bucket override where Vault will expect TLS certificates are stored. string "" no
vault_tls_cert_filename GCS object path within the vault_tls_bucket. This is the vault server certificate. string "vault.crt" no
vault_tls_disable_client_certs Use client certificates when provided. You may want to disable this if users will not be authenticating to Vault with client certificates. string false no
vault_tls_key_filename Encrypted and base64 encoded GCS object path within the vault_tls_bucket. This is the Vault TLS private key. string "vault.key.enc" no
vault_tls_kms_key Fully qualified name of the KMS key, for example, vault_tls_kms_key = "projects/PROJECT_ID/locations/LOCATION/keyRings/KEYRING/cryptoKeys/KEY_NAME". This key should have been used to encrypt the TLS private key if Terraform is not managing TLS. The Vault service account will be granted access to the KMS Decrypter role once it is created so it can pull from this the vault_tls_bucket at boot time. This option is required when manage_tls is set to false. string "" no
vault_tls_kms_key_project Project ID where the KMS key is stored. By default, same as project_id string "" no
vault_tls_require_and_verify_client_cert Always use client certificates. You may want to disable this if users will not be authenticating to Vault with client certificates. string false no
vault_ui_enabled Controls whether the Vault UI is enabled and accessible. string true no
vault_update_policy_type Options are OPPORTUNISTIC or PROACTIVE. If PROACTIVE, the instance group manager proactively executes actions in order to bring instances to their target versions string "OPPORTUNISTIC" no
vault_version Version of vault to install. This version must be 1.0+ and must be published on the HashiCorp releases service. string "1.6.0" no

Outputs

Name Description
ca_cert_pem CA certificate used to verify Vault TLS client connections.
ca_key_pem Private key for the CA.
service_account_email Email for the vault-admin service account.
vault_addr Full protocol, address, and port (FQDN) pointing to the Vault load balancer.This is a drop-in to VAULT_ADDR: export VAULT_ADDR="$(terraform output vault_addr)". And then continue to use Vault commands as usual.
vault_lb_addr Address of the load balancer without port or protocol information. You probably want to use vault_addr.
vault_lb_port Port where Vault is exposed on the load balancer.
vault_nat_ips The NAT-ips that the vault nodes will use to communicate with external services.
vault_network The network in which the Vault cluster resides
vault_storage_bucket GCS Bucket Vault is using as a backend/database
vault_subnet The subnetwork in which the Vault cluster resides

Additional permissions

The default installation includes the most minimal set of permissions to run Vault. Certain plugins may require more permissions, which you can grant to the service account using service_account_project_additional_iam_roles:

GCP auth method

The GCP auth method requires the following additional permissions:

roles/iam.serviceAccountKeyAdmin

GCP secrets engine

The GCP secrets engine requires the following additional permissions:

roles/iam.serviceAccountKeyAdmin
roles/iam.serviceAccountAdmin

GCP KMS secrets engine

The GCP secrets engine permissions vary. There are examples in the secrets engine documentation.

Logs

The Vault server logs will automatically appear in Stackdriver under "GCE VM Instance" tagged as "vaultproject.io/server".

The Vault audit logs, once enabled, will appear in Stackdriver under "GCE VM Instance" tagged as "vaultproject.io/audit".

Sandboxes & Terraform Cloud

When running in a sandbox such as Terraform Cloud, you need to disable filesystem access. You can do this by setting the following variables:

# terraform.tfvars
tls_save_ca_to_disk = false

FAQ

  • I see unhealthy Vault nodes in my load balancer pool!

    This is the expected behavior. Only the active Vault node is added to the load balancer to prevent redirect loops. If that node loses leadership, its health check will start failing and a standby node will take its place in the load balancer.

  • Can I connect to the Vault nodes directly?

    Connecting to the vault nodes directly is not recommended, even if on the same network. Always connect through the load balancer. You can alter the load balancer to be an internal-only load balancer if needed.