talegari/safer

Creating shorter encrypted strings


Is it possible to control the length of the encrypted string? For example, when I encrypt a vector of 100 strings of 20 characters each, every encrypted string is 48 characters long, which can add substantial size to large data sets with millions or billions of observations.

Yes, this is an issue for large data sets with a substantial number of unique entries.

Currently, the last step in encoding is done with base64encode.
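
To see where the 48 characters can come from, here is a small sketch in Python (used only for illustration, since safer itself is an R package). It assumes standard padded base64, which maps every 3 bytes to 4 characters. The helper name base64_length is hypothetical, and the 36 zero bytes are a placeholder, not real ciphertext.

```python
import base64
import math

def base64_length(n_bytes: int) -> int:
    """Length of padded base64 output for n_bytes of input."""
    return 4 * math.ceil(n_bytes / 3)

# Placeholder bytes standing in for ciphertext (assumption: 36 bytes).
payload = bytes(36)
encoded = base64.b64encode(payload)

print(len(encoded), base64_length(36))  # 48 48
```

A padded base64 string of 48 characters decodes to between 34 and 36 bytes. If the 20-character inputs are 20 bytes each, this suggests that part of the growth happens before the base64 step, not only because of it.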

If an encoding exists that produces shorter strings while still being one-to-one, we could optionally apply it to the output, shortening it at the cost of computation time. I am not aware of such a compression library or of an R implementation/API of one.
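
One candidate worth checking is base85, which is also reversible but uses 85 printable characters, so every 4 bytes become 5 characters. The following is a minimal Python sketch using the standard library, shown for illustration only. It assumes a 36-byte payload (the size that padded base64 turns into 48 characters) and uses random bytes in place of real ciphertext. Whether a comparable R function is available is not established here.

```python
import base64
import os

# Random bytes standing in for ciphertext (assumption: 36 bytes).
payload = os.urandom(36)

b64 = base64.b64encode(payload)
b85 = base64.b85encode(payload)

# Both encodings decode back to the original bytes, so they are one-to-one.
assert base64.b64decode(b64) == payload
assert base64.b85decode(b85) == payload

print(len(b64), len(b85))  # 48 45
```

Under these assumptions the saving is 3 characters out of 48, about 6%, and the base85 alphabet contains punctuation that may need care in some storage formats.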

Some articles suggest that base64 is close to optimal among encodings restricted to printable ASCII characters.
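
A rough calculation makes this concrete. It assumes the output is restricted to an alphabet of at most 94 printable, non-space ASCII characters, so each character carries at most log2(94), about 6.55 bits, against exactly 6 bits for base64. The helper name min_chars and the 36-byte payload size are assumptions for illustration (Python is used only as a calculator here).

```python
import math

def min_chars(n_bytes: int, alphabet_size: int) -> int:
    """Fewest characters that can represent every n_bytes value one-to-one."""
    return math.ceil(8 * n_bytes / math.log2(alphabet_size))

# Compare base64, base85, and the full printable ASCII alphabet.
for size in (64, 85, 94):
    print(size, min_chars(36, size))
# 64 48
# 85 45
# 94 44
```

So for a 36-byte payload, no one-to-one encoding into printable ASCII can go below 44 characters. A larger reduction would have to come from producing fewer bytes before the encoding step.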