Explicit encryption required?

Question

Explicit encryption required?

moritzploss opened this issue 4 years ago · 4 comments

First of all, thanks for this great tutorial! I learned a lot following along!

After going through the code several times, I'm wondering why we would need to explicitly encrypt/decrypt field values of our custom EncryptedField type before saving them in the database. Isn't that why we define our custom Ecto Types to begin with, so that we don't need to encrypt/decrypt values manually and explicitly, but rather leave these tasks to the callbacks that we define?

For example, we use this function to explicitly encrypt the name and email field before insertion:

defp encrypt_fields(changeset) do
  case changeset.valid? do
    true ->
      {:ok, encrypted_email} = EncryptedField.dump(changeset.data.email)
      {:ok, encrypted_name} = EncryptedField.dump(changeset.data.name)
      changeset
      |> put_change(:email, encrypted_email)
      |> put_change(:name, encrypted_name)
    _ ->
      changeset
  end
end

From what I understand, both the email and name field are using our custom EncryptedField Ecto type, and therefore the dump callback defined in Encryption.EncryptedField will be called just before the data is inserted into the database, encrypting the field values before saving them.

Similarly, we're explictly calling EncryptedField.load on the retrieved value in User.one/0:

def one() do
    user =
      %User{name: name, email: email, key_id: key_id, password_hash: password_hash} =
      Repo.one(User)

    {:ok, email} = EncryptedField.load(email, key_id)
    {:ok, name} = EncryptedField.load(name, key_id)
    %{user | email: email, name: name, password_hash: password_hash}
 end

I guess we need this function so that we can explicitly specify the key_id, but doesn't that mean that we're encrypting the value twice before inserting it into the database (once implicitly, once explicitly), and decrypting it twice (once implicitly, once explicitly) when retrieving it?

Thanks again for putting this tutorial together!

Answer 1 · 2020-04-11T18:04:13.000Z

@moritzploss thanks for opening this issue and your great feedback, your observation is spot-on.
Yes, the dump and load functions will handle encryption and decryption for us.
If you have time, please create a Pull Request to update the instructions/code to reflect this.
(if not, don't worry, just the fact that you've opened this issue is awesome!) ☀️

Answer 2 · 2020-04-14T16:54:25.000Z

Great, I'll give it a go (hopefully soon)

Answer 3 · 2020-04-15T17:10:51.000Z

So far I found that if the EncryptedField should be self-contained, the key_id would need to be accessible in EncryptedField.load/1 (and therefore in AES.decrypt/1).

One way to achieve this would be to get rid of the key_id field in the User schema and just store the ID as part of the encrypted binary. (I think this is also safer in case we make partial updates to a database entry).

So to encrypt:

Before

def encrypt(plaintext) do
    iv = :crypto.strong_rand_bytes(16) # create random Initialisation Vector
    key = get_key()    # get the *latest* key in the list of encryption keys
    {ciphertext, tag} =
      :crypto.block_encrypt(:aes_gcm, key, iv, {@aad, to_string(plaintext), 16})
    iv <> tag <> ciphertext # "return" iv with the cipher tag & ciphertext
end

After

def encrypt(plaintext) do
    iv = :crypto.strong_rand_bytes(16)
    key = get_key() # get latest key
    key_id = get_key_id() # get latest ID; can be more elegant, but you get the idea
    {ciphertext, tag} =
        :crypto.block_encrypt(:aes_gcm, key, iv, {@aad, plaintext, 16})
    iv <> tag <> <<key_id::unsigned-big-integer-32>> <> ciphertext
end

And to decrypt:

Before

def decrypt(ciphertext) do
    <<iv::binary-16, tag::binary-16, ciphertext::binary>> = ciphertext
    :crypto.block_decrypt(:aes_gcm, get_key(), iv, {@aad, ciphertext, tag})
end

After

def decrypt(ciphertext) do
    <<iv::binary-16, tag::binary-16, key_id::unsigned-big-integer-32, ciphertext::binary>>
        = ciphertext
    :crypto.block_decrypt(:aes_gcm, get_key(key_id), iv, {@aad, ciphertext, tag})
end

With that we could get rid of all the explicit encryption code in the User schema, as well as AES.encrypt/2, AES.decrypt/2 and EncryptedField.load/2.

I don't know, do you think this is going in the right direction?

Answer 4 · 2020-04-15T17:18:20.000Z

@moritzploss yeah, looking good so far. 👍