If you do not want to read thousands of words of text here is a brief of features that hpenc
provides:
- You want to transfer data between your servers quickly. That could be terabytes of data, but your NIC
is powerful and can transfer up to 10G. Your current options are:
ssh
- a good choice, but not very fast and it cannot use more than 1 CPUgpg
- slow, old and requires complex keys generationopenssl enc
- no integrity check, no multi-core support
For hpenc
you can use all your CPU cores and burn your network as fast as possible. All data passed is
authenticated meaning that all corruption will be detected immediately. That is more like encryption +
digest, but much much faster. On the other hand, hpenc
utilizes the modern cryptographic ciphers and tries
to use hardware acceleration if possible.
I understand that external dependencies might be evil for servers transferring, hence, hpenc
depends only on openssl
and sodium
libraries and C++11
compiler which is likely available in all operating system.
-
You want to store like 5Tb of movies on some backup space available via
ftp
. You do not want anybody but you to watch those movies.hpenc
allows you to encrypt and authenticate your backup quickly. Of course, you need some extra space for storing authentication data but it is negligible in terms of the overall size (about 16K for each gigabyte of data). In this case you still do not need to create som complex atchitecture of keys: just generate a key and store it in some secure place. -
You need some fast entropy generator. Of course, you could use
rand()
function from your libc. However, it is very insecure if you want cryptographic input.hpenc
provides secure pseudo-random data on a very high speed. In fact, you can generate up to 2 GB of random numbers per second on a modern hardware (Inter Core i7 SandyBridge)! Comparing to 14 Mb per second for reading from/dev/urandom
it is really fast (but still secure).
If you are still not convinced here is a list of features provided by hpenc
:
- Authenticated encryption - your data cannot be forged or corrupted without detection.
- Parallel processing -
hpenc
uses block IO and you can process multiple blocks simultaneously, which is extremely useful if you have multi-core environment. - Strong ciphers -
hpenc
uses the state-of-artaes-gcm
andchacha20
ciphers in counter-like mode. This provides up to 2^248 complexity to break the cipher 1 (2^128 complexity is considered as totally secure in pre-quantum-computer world). - Easy interface -
hpenc < in > out
: what could be more simple? - Hardware acceleration - do you have the modern CPU?
hpenc
can utilize all its advanced cryptography functions defined forAES-NI
andPCLMULQDQ
instructions (that must be supported by openssl). For those with old or embedded CPU (such as ARM),hpenc
provides portable and fastchacha20
cipher. - Simple key management - all that you need to remember/transfer is 52 symbols shared key. All these symbols are pronounceable meaning that you can tell them using a phone or writing them down on a sheet of paper without worrying about
o
and0
ambiguity. - (Almost) zero dependencies -
hpenc
requires onlylibcrypto
from openssl >= 1.0.1d and C++11 compatible compiler: gcc 4.7+ or clang 3.3+. If you use some punny system that does not satisfy these requirements, than you don't care about performance anyway. - Secure random numbers generator -
hpenc
can work as pseudo-random numbers generator. In a set of standard tests (diehard) on the generated sequenceshpenc
generates secure sequences of pseudo-random numbers on a very high speed (gigabytes per second).
Generate PSK:
hpenc psk
Encrypt data:
echo 'data' | hpenc -k 8jc38bntqehs31f3q8j4du4ry88k34ugh6eux6aoggpkbywgok9y > encrypted
Decrypt data:
hpenc -k 8jc38bntqehs31f3q8j4du4ry88k34ugh6eux6aoggpkbywgok9y -d < encrypted
Run as random number generator:
hpenc -r -b 1M -c 10 > random
Securely reset all data on your hard drive:
hpenc -r -b 1M > /dev/hda
Move data over the network (using bar utility):
bar -b 16M -s 102400M /dev/vg0/lvol1 | ./hpenc -b 16M -k 8jc38bntqehs31f3q8j4du4ry88k34ugh6eux6aoggpkbywgok9y | nc target 1234
nc -l 1234 | ./hpenc -d -k 8jc38bntqehs31f3q8j4du4ry88k34ugh6eux6aoggpkbywgok9y > /dev/vg0/lvol
hpenc
provides command line API for authenticated encryption algorithms, in particular AES-[128|256]-GCM
and
ChaCha20-Poly1305
. This tool is intended to perform bulk encryption with authentication tags for further data
transfer. Unlike openssl enc
this tool supports only modern authenticated encryption ciphers providing both
secrecy and integrity of input data.
hpenc
does not perform any asymmetric key exchange but uses pre-shared keys instead. Therefore, a user can generate
psk
on one machine:
hpenc -a chacha psk
Random key: en1o46877q7zxcen1cgbjc8qzjk1etdmsoctwyi3yy38ant7q59b
Then this key should be used for both encryption and decryption procedures, for example:
- encryption:
hpenc -a chacha -k en1o46877q7zxcen1cgbjc8qzjk1etdmsoctwyi3yy38ant7q59b < in > out
- decryption:
hpenc -d -k en1o46877q7zxcen1cgbjc8qzjk1etdmsoctwyi3yy38ant7q59b < in > out
hpenc
uses block mode of operations meaning that the input is divided into blocks of data.
By default, block's size is equal to 4096
, however, for large data portions it is advised to
increase it to several megabytes:
hpenc -a chacha -k en1o46877q7zxcen1cgbjc8qzjk1etdmsoctwyi3yy38ant7q59b -b 4M < in > out
The maximum size of a block is limited to 16 megabytes (because we need to change nonces).
Hardware used: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz (thanks to openwall for the hardware provided)
- Graph of dependency of encryption throughput on the input block size:
- Graph of dependency of encryption throughput on the number of CPU cores:
Hardware used: AMD Opteron(tm) Processor 6344, 24 physical cores, 48 logical cores, 4 NUMA nodes
- Graph of dependency of PRF throughput on the number of CPU cores:
hpenc
uses stream ciphers with authentication tag:
- aes-[128|256]-gcm - high performance cipher that sometimes could be accelerated by the hardware using
AES-NI
andPCLMULQDQ
instructions in the modern processors (such as Intel Sandybridge or newer). These ciphers are supported viaopenssl
library (openssl must be >= 1.0.1) - chacha20-poly1305 - high performance cipher that doesn't require any hardware acceleration. This cipher could be significantly faster in case of generic hardware.
Internally, hpenc
uses pre-shared key for key derivation only. For this purposes, it applies
xchacha20
stream cipher to a string of all zeroes using this pre-shared key as key and monotonically
increasing counter as nonce, the inital nonce is encoded to the message's header and is chosen randomly.
During bulk encryption hpenc
splits input into blocks of data. Each block is appended with MAC tag calculated
from the encrypted content and non-encrypted portion that represents the length of the current block. So far,
data content and length are both authenticated.
Each block is encrypted using monotonically increasing counter as nonce. This counter starts from 1
and can count up
to 2^64 (depending on ciphers) allowing thus up to 2^64 blocks being encrypted. However, hpenc
also derives new key
each 4096 blocks removing this limitation as well.
hpenc
append a small header to each stream encrypted which helps decrypter to figure out the following attributes:
- magic of the stream and version (
'hpenc' \0 \0 \1
, 8 bytes) - algorithm code (network byte order, 4 bytes)
- block length (network byte order, 4 bytes)
- initial nonce/salt for deriving keys (24 bytes)
Therefore, the overall overhead of encryption is calculated as following:
(nblocks) * (mac_length) + 16
So far, for 1Gb stream and 1Mb block size it is (16 * 1024 + 40)
~= 16K
$ ./src/hpenc -r | dieharder -g 200 -a
#=============================================================================#
# dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
#=============================================================================#
rng_name |rands/second| Seed |
stdin_input_raw| 4.07e+07 |3553750306|
#=============================================================================#
test_name |ntup| tsamples |psamples| p-value |Assessment
#=============================================================================#
diehard_birthdays| 0| 100| 100|0.49770684| PASSED
diehard_operm5| 0| 1000000| 100|0.35666073| PASSED
diehard_rank_32x32| 0| 40000| 100|0.93096659| PASSED
diehard_rank_6x8| 0| 100000| 100|0.71387575| PASSED
diehard_bitstream| 0| 2097152| 100|0.94036648| PASSED
diehard_opso| 0| 2097152| 100|0.35376777| PASSED
diehard_oqso| 0| 2097152| 100|0.87245846| PASSED
diehard_dna| 0| 2097152| 100|0.73729069| PASSED
diehard_count_1s_str| 0| 256000| 100|0.25678488| PASSED
diehard_count_1s_byt| 0| 256000| 100|0.03694089| PASSED
diehard_parking_lot| 0| 12000| 100|0.96263190| PASSED
diehard_2dsphere| 2| 8000| 100|0.97821441| PASSED
diehard_3dsphere| 3| 4000| 100|0.97629855| PASSED
diehard_squeeze| 0| 100000| 100|0.87578355| PASSED
diehard_runs| 0| 100000| 100|0.06327493| PASSED
diehard_runs| 0| 100000| 100|0.58609759| PASSED
diehard_craps| 0| 200000| 100|0.26970372| PASSED
diehard_craps| 0| 200000| 100|0.71135830| PASSED
Multithreading (stream ciphers works perfectly fine for multiple threads)- PBKDF (for password based keys)
- ???
- Profit