Explicit encryption required?
moritzploss opened this issue ยท 4 comments
First of all, thanks for this great tutorial! I learned a lot following along!
After going through the code several times, I'm wondering why we would need to explicitly encrypt/decrypt field values of our custom EncryptedField
type before saving them in the database. Isn't that why we define our custom Ecto Types to begin with, so that we don't need to encrypt/decrypt values manually and explicitly, but rather leave these tasks to the callbacks that we define?
For example, we use this function to explicitly encrypt the name and email field before insertion:
defp encrypt_fields(changeset) do
case changeset.valid? do
true ->
{:ok, encrypted_email} = EncryptedField.dump(changeset.data.email)
{:ok, encrypted_name} = EncryptedField.dump(changeset.data.name)
changeset
|> put_change(:email, encrypted_email)
|> put_change(:name, encrypted_name)
_ ->
changeset
end
end
From what I understand, both the email
and name
field are using our custom EncryptedField
Ecto type, and therefore the dump
callback defined in Encryption.EncryptedField
will be called just before the data is inserted into the database, encrypting the field values before saving them.
Similarly, we're explictly calling EncryptedField.load
on the retrieved value in User.one/0
:
def one() do
user =
%User{name: name, email: email, key_id: key_id, password_hash: password_hash} =
Repo.one(User)
{:ok, email} = EncryptedField.load(email, key_id)
{:ok, name} = EncryptedField.load(name, key_id)
%{user | email: email, name: name, password_hash: password_hash}
end
I guess we need this function so that we can explicitly specify the key_id
, but doesn't that mean that we're encrypting the value twice before inserting it into the database (once implicitly, once explicitly), and decrypting it twice (once implicitly, once explicitly) when retrieving it?
Thanks again for putting this tutorial together!
@moritzploss thanks for opening this issue and your great feedback, your observation is spot-on.
Yes, the dump
and load
functions will handle encryption and decryption for us.
If you have time, please create a Pull Request to update the instructions/code to reflect this.
(if not, don't worry, just the fact that you've opened this issue is awesome!) โ๏ธ
Great, I'll give it a go (hopefully soon)
So far I found that if the EncryptedField
should be self-contained, the key_id
would need to be accessible in EncryptedField.load/1
(and therefore in AES.decrypt/1
).
One way to achieve this would be to get rid of the key_id
field in the User schema and just store the ID as part of the encrypted binary. (I think this is also safer in case we make partial updates to a database entry).
So to encrypt:
Before
def encrypt(plaintext) do
iv = :crypto.strong_rand_bytes(16) # create random Initialisation Vector
key = get_key() # get the *latest* key in the list of encryption keys
{ciphertext, tag} =
:crypto.block_encrypt(:aes_gcm, key, iv, {@aad, to_string(plaintext), 16})
iv <> tag <> ciphertext # "return" iv with the cipher tag & ciphertext
end
After
def encrypt(plaintext) do
iv = :crypto.strong_rand_bytes(16)
key = get_key() # get latest key
key_id = get_key_id() # get latest ID; can be more elegant, but you get the idea
{ciphertext, tag} =
:crypto.block_encrypt(:aes_gcm, key, iv, {@aad, plaintext, 16})
iv <> tag <> <<key_id::unsigned-big-integer-32>> <> ciphertext
end
And to decrypt:
Before
def decrypt(ciphertext) do
<<iv::binary-16, tag::binary-16, ciphertext::binary>> = ciphertext
:crypto.block_decrypt(:aes_gcm, get_key(), iv, {@aad, ciphertext, tag})
end
After
def decrypt(ciphertext) do
<<iv::binary-16, tag::binary-16, key_id::unsigned-big-integer-32, ciphertext::binary>>
= ciphertext
:crypto.block_decrypt(:aes_gcm, get_key(key_id), iv, {@aad, ciphertext, tag})
end
With that we could get rid of all the explicit encryption code in the User schema, as well as AES.encrypt/2
, AES.decrypt/2
and EncryptedField.load/2
.
I don't know, do you think this is going in the right direction?
@moritzploss yeah, looking good so far. ๐