defrex/django-encrypted-fields

Consistent encryption output, possible?

sandervm opened this issue · 2 comments

Hi there,

When I encrypt the same value repeatedly, Keyczar generates a different output each time:

>>> from django.conf import settings
>>> from keyczar.keyczar import Crypter
>>> crypter = Crypter.Read(settings.ENCRYPTED_FIELDS_KEYDIR)
>>> crypter.Encrypt('fooo')
'ALDUpBUMjbaF-cQ7Eq9MTaFASmg1fC5SEgdqtBxcHxNsY_H0gYKAB_x-dT-_2K_f_AqW4fpCsgdl'
>>> crypter.Encrypt('fooo')
'ALDUpBVfCJcWvZ9sEfjJpQyWHvnJDeyii_LDSr93G9FqM0QR1-5p_w3uinNwI4OxYJYAi-9SeeJY'

This is of course very secure, but I am wondering whether it's possible to make it output the same ciphertext every time I call it. I ask because I am trying to use your fields in combination with Django's custom user model. This currently fails because the stored value never matches the newly generated one.

The short answer is no. This is a feature of Keyczar and is very much intentional and needed. Loosely, some randomness is added to your plaintext before it's encrypted, and stripped away after it's decrypted. The procedure is a requirement for strong encryption.

For more information, see Initialization Vector.
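To make the idea concrete, here is a toy sketch of the "add randomness, strip it on decryption" pattern. It is not Keyczar's algorithm and should not be used to protect real data; the function names, the nonce size, and the HMAC-based keystream are all assumptions chosen so the example runs with only the standard library.

```python
import hashlib
import hmac
import os

NONCE_SIZE = 16  # bytes of fresh randomness per message (illustrative choice)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key and nonce."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt with a fresh random nonce, which is prepended to the output."""
    nonce = os.urandom(NONCE_SIZE)  # different on every call
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    return nonce + body


def toy_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Split off the nonce, rebuild the keystream, and recover the plaintext."""
    nonce, body = ciphertext[:NONCE_SIZE], ciphertext[NONCE_SIZE:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))
```

Because the nonce changes on every call, two encryptions of the same value differ, yet both decrypt to the same plaintext. That is also why comparing a stored ciphertext with a freshly generated one can never succeed.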

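Ultimately, you can't query against data encrypted this way, because the same plaintext never produces the same ciphertext twice. You may be able to work around the problem with hashing, though. For example, if your user provides an email and password for authentication, but you want the email stored encrypted in your database, you could store a hash of the email alongside the encrypted value. Then you can recompute the hash at login, query against it, and decrypt the email only once you've found your user. That's similar to how Django stores passwords: as a hash rather than the original value. Note that Django salts each password hash individually, so for lookups the hash has to be deterministic, for example an HMAC keyed with a fixed secret.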
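A minimal sketch of that lookup hash might look like the following. The function name `email_lookup_hash`, the normalization step, and the idea of passing in a dedicated secret are assumptions for illustration, not part of django-encrypted-fields.

```python
import hashlib
import hmac


def email_lookup_hash(email: str, secret: bytes) -> str:
    """Return a deterministic keyed hash of an email, suitable for equality lookups.

    The same email and secret always give the same hex digest, so the value
    can be stored in an indexed column and queried directly.
    """
    # Normalize so "Foo@Example.com " and "foo@example.com" map to the same hash
    # (an assumption; adjust to whatever your application treats as equal).
    normalized = email.strip().lower()
    return hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
```

In a Django model you could keep this value in an ordinary indexed `CharField` (say, a hypothetical `email_hash`) next to the encrypted email field, fill it in when saving, and look users up with something like `User.objects.get(email_hash=email_lookup_hash(submitted_email, secret))`. Using a keyed hash rather than a plain SHA-256 means someone who obtains the database but not the secret cannot simply hash guessed addresses to confirm them.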

Hope that helped.

Thanks for your reply defrex! I like the nice background information!

And the workaround you suggested seems like a very good idea indeed. I wasn't aware Django already did it this way... I will certainly look into this.

Thanks again, keep up the good work!