Performance issue with OpennSSL 3 and libsrtp
murillo128 opened this issue · 9 comments
I think it is an OpenSSL issue, but crossposting it here:
I am building libsrtp using nodejs bundled OpenSSL and I have seen a performance performance issue with node 18 & 19 which bundles OpenSSL 3.0 both on AES GCM and AES CTR modes compared to node 16 which bundles OpenSSL 1.1 using the following benchmark:
https://gist.github.com/Sean-Der/7a42bd70edfe1324ccc6ab399d653c0e
The performance hit is quite significative, with AES GCM being more than 2x slower and AES CTR being 1.5x slower (depending on the server types I have tested with):
Server 1:
node v16.19.0 - openssl 1.1.1s+quic
gcm test done in 847746ms
ctr test done in 2209247ms
node v18.15.0 - openssl 3.0.8+quic
gcm test done in 2042822
ctr test done in 2837856
Server 2:
node v16.19.0 - openssl 1.1.1s+quic
gcm test done in 2210976
ctr test done in 3035863
node v18.15.0 - openssl 3.0.8+quic
gcm test done in 4259098
ctr test done in 4081153
Hi, I will try to reproduce with your test and see. It looks like it is encrypting a very small packet so the idea is mostly to measure the overhead of setting up encryption, rather than actually performing encryption, does that sound correct?
i can try later with bigger payloads
Tried with different payload sizes and still seeing the performance regression on OpenSSL 3
Node 16
0 payload size gcm done in 886714
0 payload size ctr done in 2421151
400 payload size gcm done in 1494584
400 payload size ctr done in 4381984
1200 payload size gcm done in 2326866
1200 payload size ctr done in 8568198
Node 18
0 payload size gcm done in 2007877
0 payload size ctr done in 3287084
400 payload size gcm done in 2705423
400 payload size ctr done in 5485942
1200 payload size gcm done in 3549461
1200 payload size ctr done in 9669171
Seems it is highly unlikely that openssl performance issues are solved anytime soon. Would you be prone to consider a patch supporting asm gcm functions directly? (like porting the code from go: https://go.dev/src/crypto/aes/)
@murillo128 do you mean as "built in support" like the existing AES implementation. In general I would prefer not to have to maintain and support such code but it is possible. Would need to talk with others.
OpenSSL is not the only crypto back end we support, have you looked at the alternatives? You could also provide you own backend.
There is also the possibility of replacing the current implementation at run time, see
libsrtp/crypto/include/cipher.h
Line 241 in fab73a3
I agree that maintaining the code is a burden, that's why i preferred to be in libsrtp itself hehe..
the srtp_replace_cipher_type seems very helpful and would allow me to implement the hw optimized gcm version for the chipsets i care to optimize, but keeping openssl backend for everything else. Thank you for the tip!
Follow up on this, in case it's useful to someone: we ended up extracting code from OpenSSL into a custom AES-GCM backend for libsrtp.
Our conclusion is that OpenSSL is designed for lengthy encryption ops, and not just with regards to the whole OSSL_PARAM API introduced in v3. The particular asm backend we were interested in was also made substantially slower for small messages in order to reduce its state size, so it could fit in the existing GCM128_CONTEXT
structure. We reverted this tradeoff before generating the assembly to extract into our codebase.
We considered upstreaming the backend, but given it only works for AVX-512 capable processors we decided not to.
Hi, yes AVX-512 only is probably too specific.
I guess this task needs some follow up to see if there have been any recent improvements in OpenSSL.
Will leave the issue open and hopefully loop back around soon and rerun some test.