external-secrets/kubernetes-external-secrets

Unexpected HashiCorp Vault Batch or "Non Renewable" Tokens Issue

Closed this issue · 4 comments

c-bx commented

Hello,

I am running tests in a bare-minimum Kubernetes cluster deployed with minikube (v1.19.0) on a RHEL VM, solely to test this integration (External Secrets) and several others for retrieving secrets from a backend Vault service. I started the cluster with minikube start --vm-driver=none and installed the External Secrets Helm chart with the following edits to values.yaml for the Vault backend:

# Default values for kubernetes-external-secrets.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# Environment variables to set on deployment pod
env:
  POLLER_INTERVAL_MILLISECONDS: 60000
  VAULT_ADDR: <ADDR>
  VAULT_NAMESPACE: <NAMESPACE>
  DEFAULT_VAULT_MOUNT_POINT: "<MOUNT>"
  LOG_LEVEL: "debug"
  DEFAULT_VAULT_ROLE: "<ROLE>"
  METRICS_PORT: 3001

ExternalSecrets CRD:

apiVersion: 'kubernetes-client.io/v1'
kind: ExternalSecret
metadata:
  name: hello-vault-service
spec:
  backendType: vault
  data:
    - name: secrets
      key: <SECRET_PATH>
      parameter: <KEY>

Upon successful installation, I updated the Vault Kubernetes auth config with the appropriate service account JWT reviewer token and CA certificate. The Vault role is defined with token_type "batch" and a token_ttl of 5 minutes. The secrets engine is KV version 2.

Expected Outcome: Given the 1-minute POLLER_INTERVAL_MILLISECONDS, the renewal attempt for the batch token should happen once the remaining TTL is at or below the 3-minute mark. The renewal attempt should fail, since batch tokens are not renewable; the client.token=null assignment then takes place and a new authentication flow executes. After authenticating and obtaining a newly generated batch token, the read of the secrets should succeed. This cycle should repeat.
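
The expected flow can be sketched as a single poll step. This is a minimal illustration under stated assumptions, not the project's code: `pollOnce`, the `state` object, and the injected `vault` object with its `login`, `lookupSelf`, `renewSelf`, and `read` methods are all hypothetical stand-ins for the real Vault calls.

```javascript
// Hypothetical sketch of one poll; `vault` is an injected stand-in,
// not the node-vault API.
async function pollOnce (state, vault, renewThresholdSeconds) {
  if (state.token) {
    // Check the remaining TTL of the cached token.
    const { ttl } = await vault.lookupSelf(state.token)
    if (ttl <= renewThresholdSeconds) {
      try {
        await vault.renewSelf(state.token)
      } catch (err) {
        // Batch tokens cannot be renewed: drop the cached token so that
        // a new authentication runs below.
        state.token = null
      }
    }
  }

  if (!state.token) {
    state.token = await vault.login()
  }

  // The read must use whatever token is cached *now*, i.e. the newly
  // issued one after a failed renewal.
  return vault.read(state.token)
}
```

With a 300-second batch token, a 60-second poll interval, and a 180-second threshold, this sketch would reuse the first token for two polls and switch to a freshly issued token on the third.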

Actual Outcome: All of the above takes place up to and including the renewal attempt. client.token=null is set and the authentication flow executes successfully (confirmed in the Vault backend). However, when the read secrets call is made, Vault returns permission denied, and I am not sure why. I do know that the client_token the Vault backend sees during this failed read is the original batch token, not the newly generated one. I checked whether client.token is set to the newly generated token on the next poll; it is not. Instead, the token lookup/renewal calls fail, a new authentication runs, and the failed loop continues. This has a bad side effect on the Vault backend: new tokens are generated at the rate of the poller interval. That can build up memory consumption and processing load and eventually bring down the Vault service.
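
To put a rough number on that side effect, here is a back-of-the-envelope sketch. It assumes one login per poll (the `loginsPerPoll` parameter is an assumption to adjust if several roles or secrets each trigger their own login), and the function and field names are illustrative only.

```javascript
// Back-of-the-envelope estimate; all names here are illustrative.
function tokenChurn ({ pollerIntervalMs, tokenTtlSeconds, loginsPerPoll = 1 }) {
  const pollsPerDay = (24 * 60 * 60 * 1000) / pollerIntervalMs
  // A token issued on one poll stays valid until its TTL runs out, so the
  // number of unexpired tokens is roughly TTL / interval (rounded up).
  const pollsPerTtl = Math.ceil((tokenTtlSeconds * 1000) / pollerIntervalMs)
  return {
    loginsPerDay: pollsPerDay * loginsPerPoll,
    unexpiredTokens: pollsPerTtl * loginsPerPoll
  }
}

// Values from this report: 60000 ms poller interval, 5-minute token_ttl.
console.log(tokenChurn({ pollerIntervalMs: 60000, tokenTtlSeconds: 300 }))
// → { loginsPerDay: 1440, unexpiredTokens: 5 }
```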

I extended the "non-renewable" test to the service token type, which is renewable by default. During the Vault role setup, I set num_uses=1 for the service token. The same issue seen with the batch token occurs here as well.
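
One way to avoid a doomed renewal call for non-renewable tokens might be to consult the `renewable` and `ttl` fields that the auth/token/lookup-self response carries under `data`. A minimal sketch, assuming that response shape and a finite TTL; `nextTokenAction` and its return values are hypothetical names, not part of any library.

```javascript
// Decide what to do with a cached token, given the `data` object of a
// lookup-self response. Hypothetical helper; assumes a finite TTL.
function nextTokenAction (lookupData, renewThresholdSeconds) {
  const ttl = Number(lookupData.ttl)
  if (ttl > renewThresholdSeconds) {
    return 'keep'
  }
  // At or below the threshold: renew only if Vault says the token is
  // renewable, otherwise go straight to a fresh login.
  return lookupData.renewable ? 'renew' : 'reauthenticate'
}

console.log(nextTokenAction({ ttl: 170, renewable: false }, 180))
// → reauthenticate
```

This does not cover use-limited tokens: a token that has exhausted its allowed uses would fail on the lookup itself, so the caller would still need to treat a failed lookup as a signal to re-authenticate.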

Here is a fix I put in place, with the caveat that I am by no means a good JavaScript developer. I removed the usage of the client object (I believe it comes from the node-vault library), changed the map to store just the clientToken, and wrote the read secrets, lookup, renew, and authentication calls directly within vault-backend. There are a lot of log statements in place purely as a sanity check. It works as intended for batch tokens, and I haven't run into issues yet.

Please let me know if you have any questions or need additional information. Thank you for your time! Appreciate it.

'use strict'

const KVBackend = require('./kv-backend')
const axios = require('axios')

/** Vault backend class. */
class VaultBackend extends KVBackend {
  /**
   * Create Vault backend.
   * @param {Object} vaultFactory - arrow function to create a vault client.
   * @param {Number} tokenRenewThreshold - tokens are renewed when ttl reaches this threshold
   * @param {Object} logger - Logger for logging stuff.
   */
  constructor ({ vaultFactory, tokenRenewThreshold, logger, defaultVaultMountPoint, defaultVaultRole, defaultEndpoint, defaultNamespace }) {
    super({ logger })
    this._vaultFactory = vaultFactory
    this._clientTokens = new Map()
    this._tokenRenewThreshold = tokenRenewThreshold
    this._defaultVaultMountPoint = defaultVaultMountPoint
    this._defaultVaultRole = defaultVaultRole
    this._defaultEndpoint = defaultEndpoint
    this._defaultNamespace = defaultNamespace
  }

  /**
   * Fetch Kubernetes service account token.
   * @returns {string} String representing the token of the service account running this pod.
   */
  _fetchServiceAccountToken () {
    if (!this._serviceAccountToken) {
      const fs = require('fs')
      this._serviceAccountToken = fs.readFileSync('/var/run/secrets/kubernetes.io/serviceaccount/token', 'utf8')
    }
    return this._serviceAccountToken
  }

  async _authenticate (vaultMountPoint, vaultRole) {
    const token = this._fetchServiceAccountToken()
    const url = `${this._defaultEndpoint}/v1/auth/${vaultMountPoint}/login`
    const kubernetesData = {
      role: vaultRole,
      jwt: token
    }
    const headers = {
      'Content-Type': 'application/json',
      'X-Vault-Namespace': `${this._defaultNamespace}`
    }
    const response = await axios.post(url, kubernetesData, {
      headers: headers
    })
    const clientToken = response.data.auth.client_token
    this._logger.debug(`AUTHENTICATE: ${JSON.stringify(clientToken)}`)
    return clientToken
  }

  async _getSecrets (clientToken, key) {
    const url = `${this._defaultEndpoint}/v1/${key}`

    const headers = {
      'Content-Type': 'application/json',
      'X-Vault-Namespace': `${this._defaultNamespace}`,
      'X-Vault-Token': `${clientToken}`
    }

    let response = await axios.get(url, {
      headers: headers
    })
    let data = response.data
    this._logger.debug(`SECRETS: ${JSON.stringify(data)}`)
    return data
  }

  async _tokenLookupSelf (clientToken) {
    const url = `${this._defaultEndpoint}/v1/auth/token/lookup-self`

    const headers = {
      'Content-Type': 'application/json',
      'X-Vault-Namespace': `${this._defaultNamespace}`,
      'X-Vault-Token': `${clientToken}`
    }

    let response = await axios.get(url, {
      headers: headers
    })
    let data = response.data
    this._logger.debug(`LOOKUP: ${JSON.stringify(data)}`)
    return data
  }

  async _tokenRenewSelf (clientToken) {
    const url = `${this._defaultEndpoint}/v1/auth/token/renew-self`

    // TODO: make this dynamic so it stays in line with the role's token_ttl value
    const renewData = {
      increment: '5m'
    }

    const headers = {
      'Content-Type': 'application/json',
      'X-Vault-Namespace': `${this._defaultNamespace}`,
      'X-Vault-Token': `${clientToken}`
    }

    try {
      const response = await axios.post(url, renewData, {
        headers: headers
      })
      this._logger.debug(`RENEW: ${JSON.stringify(response.data)}`)
      return true
    } catch (err) {
      // Non-renewable tokens (e.g. batch tokens) are rejected by Vault; report
      // failure so the caller can drop the cached token and re-authenticate.
      this._logger.debug(`RENEW failed: ${err.message}`)
      return false
    }
  }

  /**
   * Get secret property value from Vault.
   * @param {string} key - Secret key in the backend.
   * @param {object} keyOptions - Options for this specific key, eg version etc.
   * @param {object} specOptions - Options for this external secret, eg role
   * @param {string} specOptions.vaultMountPoint - mount point
   * @param {string} specOptions.vaultRole - role
   * @param {number} specOptions.kvVersion - K/V Version 1 or 2
   * @returns {Promise} Promise object representing secret property values.
   */
  async _get ({ key, specOptions: { vaultMountPoint = null, vaultRole = null, kvVersion = 2 } }) {
    const vaultMountPointGet = vaultMountPoint || this._defaultVaultMountPoint
    const vaultRoleGet = vaultRole || this._defaultVaultRole
    const tokenCacheKey = `|m${vaultMountPointGet}|r${vaultRoleGet}|`
    let clientToken = this._clientTokens.get(tokenCacheKey)
    this._logger.debug(`CACHED: ${clientToken}`)

    // No usable cached token: authenticate and cache the new one.
    if (!clientToken) {
      clientToken = await this._authenticate(vaultMountPointGet, vaultRoleGet)
      if (clientToken) {
        this._clientTokens.set(tokenCacheKey, clientToken)
        this._logger.debug(`CACHED: ${clientToken}`)
      }
    }

    if (clientToken) {
      try {
        const secretResponse = await this._getSecrets(clientToken, key)

        const tokenStatus = await this._tokenLookupSelf(clientToken)
        this._logger.debug(`vault token (role ${vaultRoleGet} on ${vaultMountPointGet}) valid for ${tokenStatus.data.ttl} seconds, renews at ${this._tokenRenewThreshold}`)

        if (Number(tokenStatus.data.ttl) <= this._tokenRenewThreshold) {
          this._logger.debug(`renewing role ${vaultRoleGet} on ${vaultMountPointGet} vault token ${clientToken}`)

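          if (!(await this._tokenRenewSelf(clientToken))) {
            this._logger.debug(`cached token renewal failed. Clearing cached token for role ${vaultRoleGet} on ${vaultMountPointGet}`)
            // Remove the entry from the cache itself; resetting only the local
            // variable would leave the stale token in the map.
            this._clientTokens.delete(tokenCacheKey)
          }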
        }

        if (kvVersion === 1) {
          return JSON.stringify(secretResponse.data)
        }
    
        if (kvVersion === 2) {
          return JSON.stringify(secretResponse.data.data)
        }
    
        throw new Error('Unknown "kvVersion" specified')
      } catch (err) {
        this._logger.debug(`cached token operation failed (${err.message}). Clearing cached token for role ${vaultRoleGet} on ${vaultMountPointGet}`)
        this._clientTokens.delete(tokenCacheKey)
        // Rethrow so the caller sees the failure instead of an undefined result.
        throw err
      }
    }
  }
}

module.exports = VaultBackend

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.

I am facing a very similar issue, but we are not using batch tokens.

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.

This issue was closed because it has been stalled for 30 days with no activity.