tink-crypto/tink

Envelope AEAD Malleability

iamyohann opened this issue · 2 comments

Regarding the document published here
https://developers.google.com/tink/issues/envelope-aead-malleability

Envelope encryption uses a third-party provider (such as GCP or AWS) to encrypt a data encryption key (DEK).
It is possible to modify certain parts of the encrypted DEK without detection when using KmsEnvelopeAead with AwsKmsAead or GcpKmsAead as the remote provider. This is due to the inclusion of unauthenticated metadata (for instance version numbers). Modifications to this unauthenticated data are not detected by the provider.
Note that this violates the adaptive chosen-ciphertext attack property (IND-CCA-2) for this interface, although the ciphertext will still decrypt to the correct DEK. When using this interface do not presume that each DEK only corresponds to a single encrypted DEK.

Could I understand what the actual risk is here in plain English?

  • Is there a risk to encryption/decryption?
  • Is this specifically related to authenticated data with envelope encryption? (Regarding authentication, we set the additional authenticated data to blank, as we don't require authenticated data, just envelope encryption for large pieces of data, since the KMS API has payload limits.)
  • Would there be any failures to encrypt/decrypt due to this?
  • With reference to the word metadata are we talking about remote cloud providers introducing key metadata such as key version to an encrypted DEK, that allow the remote providers to determine which key version to use to decrypt the data?

I could be wrong, but if I understood this correctly, is the key risk here that authenticated data with AEAD can be an issue because there are no checks on the extra metadata that remote key providers add to an encrypted DEK's ciphertext?

If I'm understanding the internals of this scenario correctly, I presume this means a remote key provider may add metadata to the encrypted DEK cipher text (usually key version used to encrypt the raw DEK), however since Tink has no control over that metadata, there's no "authenticated data" checks on the DEK cipher text?

For Tink library consumers, does this just mean accepting the risk and trusting the remote key providers? And accepting the risk/trusting the metadata they inject into the DEK cipher-text?

If so, is this risk specific to authenticated data in envelope encryption (i.e. if we opt-out of authenticated data, with a 0 byte placeholder, the rest of AEAD is fine?)

Notes:
When I use the word metadata, I'm referring to this specific paragraph in the GCP KMS documentation

When a key is used to encrypt plaintext, its primary key version is used to encrypt that data. The information as to which version was used to encrypt data is stored in the ciphertext of the data. Only one version of a key can be primary at any given point in time.

https://cloud.google.com/kms/docs/key-states#symmetric_encryption

kste commented

The overall ciphertext has the form C = WRAPPED_DEK || AEAD_CIPHERTEXT = WRAP(KMS_KEY, DEK) || AEAD(DEK, "plaintext"). The malleability here applies only to the WRAPPED_DEK part, which is produced by a third-party KMS. While key wrapping is often done with an AEAD, which would detect any modification, in practice we noticed that an external KMS might produce WRAPPED_DEK = METADATA || AEAD(KMS_KEY, DEK), where (parts of) METADATA might not be authenticated by the third-party KMS.
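To make the structure above concrete, here is a minimal sketch of a toy KMS whose wrapped DEK carries an unauthenticated metadata prefix. Everything in it is an assumption made for illustration: the 4-byte metadata layout, the names `toy_kms_wrap` and `toy_kms_unwrap`, and the HMAC-based toy cipher are hypothetical and are not the wire format or API of AWS KMS, GCP KMS, or Tink.

```python
import hashlib
import hmac
import os

# Hypothetical layout: METADATA (4 bytes) || NONCE (16) || BODY || TAG (32).
# This is a toy stand-in for a remote KMS, not any real provider's format.
META_LEN = 4    # 1 version byte + 3 reserved bytes that unwrap ignores
NONCE_LEN = 16
TAG_LEN = 32


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Derive n bytes from HMAC-SHA256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < n:
        block = b"enc" + nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:n]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_kms_wrap(kms_key: bytes, dek: bytes, version: int = 1) -> bytes:
    metadata = bytes([version, 0, 0, 0])
    nonce = os.urandom(NONCE_LEN)
    body = _xor(dek, _keystream(kms_key, nonce, len(dek)))
    # The tag covers nonce and body, but NOT the metadata prefix.
    tag = hmac.new(kms_key, b"mac" + nonce + body, hashlib.sha256).digest()
    return metadata + nonce + body + tag


def toy_kms_unwrap(kms_key: bytes, wrapped: bytes) -> bytes:
    nonce = wrapped[META_LEN:META_LEN + NONCE_LEN]
    body = wrapped[META_LEN + NONCE_LEN:-TAG_LEN]
    tag = wrapped[-TAG_LEN:]
    expected = hmac.new(kms_key, b"mac" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrapped DEK failed authentication")
    return _xor(body, _keystream(kms_key, nonce, len(body)))


kms_key = os.urandom(32)
dek = os.urandom(32)
wrapped = toy_kms_wrap(kms_key, dek)

# Flip a bit in a reserved metadata byte: a different byte string, same DEK.
tampered = bytearray(wrapped)
tampered[1] ^= 0x01
tampered = bytes(tampered)
same_dek = toy_kms_unwrap(kms_key, tampered) == dek

# Flip a bit in the authenticated body: the toy KMS rejects it.
broken = bytearray(wrapped)
broken[META_LEN + NONCE_LEN] ^= 0x01
try:
    toy_kms_unwrap(kms_key, bytes(broken))
    body_change_rejected = False
except ValueError:
    body_change_rejected = True
```

In this sketch two distinct wrapped DEKs unwrap to the same DEK, which is the sense in which one DEK does not correspond to a single encrypted DEK, while a change to the authenticated part is still rejected.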

This is not necessarily a security issue, as METADATA could be something like the version of the wrapping mechanism, and hopefully the third-party KMS does not have bad interactions between different wrapping algorithms. Modifying the WRAPPED_DEK does not allow an adversary to learn anything about the plaintext or modify any actual data, as long as the wrapping mechanism of the third-party KMS is secure. I could see unauthenticated METADATA enabling some more obscure scenarios though (see e.g. https://eprint.iacr.org/2020/1456), which could allow an adversary to request that the wrong key version be used to unwrap. I am not aware of this being possible with the KMSs we support in Tink, though.

The main risks arise if someone assumes that:

  1. The envelope encryption ciphertext is a uniform random string. This cannot be guaranteed by Tink for the WRAPPED_DEK part, as the third party KMS wrapping the DEK might not uphold this property.
  2. Modifying the envelope encryption ciphertext will always lead to a decryption failure. This is currently not guaranteed by Tink for the WRAPPED_DEK, as modifications to the wrapped DEK are only protected by the wrapping mechanism of the third-party KMS.

In most practical scenarios, these properties will have little impact. In the case of 2), it could be an issue if, for example, one keeps a denylist of hashes of "bad" envelope ciphertexts: the denylist could then be bypassed by modifying bits in the WRAPPED_DEK part. In principle, 2) could be addressed by including the WRAPPED_DEK in the authenticated data part of the Tink AEAD, but this would be a breaking change and would make the actual data encryption less standard.
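The denylist scenario, and the idea of binding the WRAPPED_DEK into the authenticated data, can be sketched as follows. This is a toy model under stated assumptions, not how Tink's KmsEnvelopeAead is implemented: `toy_seal`, `toy_open`, `envelope_encrypt`, `envelope_decrypt`, the `bind_wrapped_dek` flag, the fixed-length layout, and the HMAC-based cipher are all hypothetical.

```python
import hashlib
import hmac
import os

# Hypothetical fixed-size layout, chosen only to keep parsing trivial.
META_LEN, NONCE_LEN, TAG_LEN, DEK_LEN = 4, 16, 32, 32
WRAPPED_LEN = META_LEN + NONCE_LEN + DEK_LEN + TAG_LEN


def _stream(key: bytes, nonce: bytes, n: int) -> bytes:
    """HMAC-SHA256 in counter mode as a toy keystream (illustration only)."""
    out, counter = b"", 0
    while len(out) < n:
        block = b"enc" + nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:n]


def _tag(key: bytes, ad: bytes, nonce: bytes, body: bytes) -> bytes:
    msg = b"mac" + len(ad).to_bytes(8, "big") + ad + nonce + body
    return hmac.new(key, msg, hashlib.sha256).digest()


def toy_seal(key: bytes, plaintext: bytes, ad: bytes = b"") -> bytes:
    """Toy encrypt-then-MAC AEAD: NONCE || BODY || TAG."""
    nonce = os.urandom(NONCE_LEN)
    stream = _stream(key, nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    return nonce + body + _tag(key, ad, nonce, body)


def toy_open(key: bytes, ct: bytes, ad: bytes = b"") -> bytes:
    nonce, body, tag = ct[:NONCE_LEN], ct[NONCE_LEN:-TAG_LEN], ct[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(key, ad, nonce, body)):
        raise ValueError("authentication failed")
    stream = _stream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def envelope_encrypt(kms_key: bytes, plaintext: bytes, bind_wrapped_dek: bool) -> bytes:
    dek = os.urandom(DEK_LEN)
    # The toy KMS prepends metadata that its own tag does not cover.
    wrapped = bytes(META_LEN) + toy_seal(kms_key, dek)
    ad = wrapped if bind_wrapped_dek else b""
    return wrapped + toy_seal(dek, plaintext, ad)


def envelope_decrypt(kms_key: bytes, ciphertext: bytes, bind_wrapped_dek: bool) -> bytes:
    wrapped, payload = ciphertext[:WRAPPED_LEN], ciphertext[WRAPPED_LEN:]
    dek = toy_open(kms_key, wrapped[META_LEN:])  # metadata is ignored here
    ad = wrapped if bind_wrapped_dek else b""
    return toy_open(dek, payload, ad)


kms_key = os.urandom(32)
message = b"secret report"

# Without binding: a one-bit metadata change evades a hash denylist,
# yet the variant still decrypts to the same plaintext.
ct = envelope_encrypt(kms_key, message, bind_wrapped_dek=False)
denylist = {hashlib.sha256(ct).hexdigest()}
variant = bytes([ct[0] ^ 0x01]) + ct[1:]
evades_denylist = hashlib.sha256(variant).hexdigest() not in denylist
still_decrypts = envelope_decrypt(kms_key, variant, False) == message

# With the wrapped DEK bound as authenticated data, the same change fails.
bound = envelope_encrypt(kms_key, message, bind_wrapped_dek=True)
bound_variant = bytes([bound[0] ^ 0x01]) + bound[1:]
try:
    envelope_decrypt(kms_key, bound_variant, True)
    bound_variant_rejected = False
except ValueError:
    bound_variant_rejected = True
```

In this model the payload AEAD, rather than the KMS, detects changes to the wrapped DEK once it is bound. Ciphertexts produced with and without binding are not interchangeable, which illustrates why such a change would be breaking.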

Regarding your question about authenticated data: the malleability concerns here are not with the authenticated data part, so whether you use this feature or not makes no difference. It solely affects the WRAPPED_DEK.

kste: can this be closed?