Linaro/uadk

test_hisi_hpre using async mode: performance is not stable


//async
test_hisi_hpre rsa-sgn --mode=crt --perf --trd_mode=async -t 1 --key_bits=2048

//sync
test_hisi_hpre rsa-sgn --mode=crt --perf --trd_mode=sync --key_bits=2048

sync:
23:<< sync 2 thread rsa-gen crt mode 2048 key_bits at 20899.988 ops!
45:<< sync 2 thread rsa-gen crt mode 2048 key_bits at 18341.109 ops!
67:<< sync 2 thread rsa-gen crt mode 2048 key_bits at 21087.043 ops!
89:<< sync 2 thread rsa-gen crt mode 2048 key_bits at 21136.064 ops!
111:<< sync 2 thread rsa-gen crt mode 2048 key_bits at 20772.770 ops!

async:
22:<< async 1 thread rsa-gen crt mode 2048 key_bits at 3570.233 ops!
43:<< async 1 thread rsa-gen crt mode 2048 key_bits at 58572.301 ops!
64:<< async 1 thread rsa-gen crt mode 2048 key_bits at 156542.062 ops!
85:<< async 1 thread rsa-gen crt mode 2048 key_bits at 4922.516 ops!
106:<< async 1 thread rsa-gen crt mode 2048 key_bits at 27717.783 ops!

async+bind:
numactl --cpubind=0 --membind=0 test_hisi_hpre rsa-sgn --mode=crt --perf --trd_mode=async -t 1 --key_bits=2048
linaro@ubuntu:~/test$ grep -rn "async 1 thread rsa-gen crt" log.2
22:<< async 1 thread rsa-gen crt mode 2048 key_bits at 6391.080 ops!
43:<< async 1 thread rsa-gen crt mode 2048 key_bits at 10102.447 ops!
64:<< async 1 thread rsa-gen crt mode 2048 key_bits at 18771.873 ops!
85:<< async 1 thread rsa-gen crt mode 2048 key_bits at 11188.066 ops!
106:<< async 1 thread rsa-gen crt mode 2048 key_bits at 8320.621 ops!
127:<< async 1 thread rsa-gen crt mode 2048 key_bits at 19878.707 ops!
148:<< async 1 thread rsa-gen crt mode 2048 key_bits at 6042.568 ops!
169:<< async 1 thread rsa-gen crt mode 2048 key_bits at 7707.399 ops!
190:<< async 1 thread rsa-gen crt mode 2048 key_bits at 1772.593 ops!
211:<< async 1 thread rsa-gen crt mode 2048 key_bits at 5408.329 ops!
232:<< async 1 thread rsa-gen crt mode 2048 key_bits at 4921.700 ops!
253:<< async 1 thread rsa-gen crt mode 2048 key_bits at 19461.562 ops!
274:<< async 1 thread rsa-gen crt mode 2048 key_bits at 17051.154 ops!
295:<< async 1 thread rsa-gen crt mode 2048 key_bits at 17449.238 ops!
316:<< async 1 thread rsa-gen crt mode 2048 key_bits at 4229.448 ops!
337:<< async 1 thread rsa-gen crt mode 2048 key_bits at 17965.992 ops!
358:<< async 1 thread rsa-gen crt mode 2048 key_bits at 19314.020 ops!
379:<< async 1 thread rsa-gen crt mode 2048 key_bits at 18230.029 ops!
400:<< async 1 thread rsa-gen crt mode 2048 key_bits at 4987.757 ops!
421:<< async 1 thread rsa-gen crt mode 2048 key_bits at 17950.393 ops!
442:<< async 1 thread rsa-gen crt mode 2048 key_bits at 18874.283 ops!
463:<< async 1 thread rsa-gen crt mode 2048 key_bits at 3388.029 ops!
484:<< async 1 thread rsa-gen crt mode 2048 key_bits at 120603.016 ops!
505:<< async 1 thread rsa-gen crt mode 2048 key_bits at 16211.062 ops!
526:<< async 1 thread rsa-gen crt mode 2048 key_bits at 21487.039 ops!
547:<< async 1 thread rsa-gen crt mode 2048 key_bits at 58381.984 ops!
568:<< async 1 thread rsa-gen crt mode 2048 key_bits at 10157.367 ops!
589:<< async 1 thread rsa-gen crt mode 2048 key_bits at 15453.960 ops!
610:<< async 1 thread rsa-gen crt mode 2048 key_bits at 5289.558 ops!
631:<< async 1 thread rsa-gen crt mode 2048 key_bits at 20218.229 ops!

Please also add the versions of the kernel, uadk, and openssl-uadk, as well as hardware information such as the NUMA topology.

The root cause is that the poll thread, via wd_rsa_poll_ctx, keeps trying to fetch any packet:
it calls driver->recv and keeps retrying because recv returns -EAGAIN.
But recv takes a spin_lock, so send is slowed down because it competes for the same spin_lock.
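The contention described above can be modelled in a few lines of C. This is only a sketch under stated assumptions, not uadk code: `model_queue`, `model_recv` and `model_poll` are hypothetical names, a C11 `atomic_flag` stands in for the real spin_lock, and `lock_taken` is a counter added purely for illustration. It shows that every failed recv attempt still acquires the lock that send needs.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a hardware queue guarded by one spin lock. */
struct model_queue {
    atomic_flag lock;          /* shared by the send and recv paths */
    int ready;                 /* completed responses waiting to be fetched */
    unsigned long lock_taken;  /* how often the lock was acquired */
};

static void model_lock(struct model_queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ; /* spin until the current holder releases the lock */
    q->lock_taken++;
}

static void model_unlock(struct model_queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

void model_init(struct model_queue *q)
{
    atomic_flag_clear(&q->lock);
    q->ready = 0;
    q->lock_taken = 0;
}

/* recv takes the lock before it knows whether anything is ready. */
int model_recv(struct model_queue *q)
{
    int ret = -EAGAIN;

    model_lock(q);
    if (q->ready > 0) {
        q->ready--;
        ret = 0;
    }
    model_unlock(q);
    return ret;
}

/*
 * Busy poll loop: retries recv on -EAGAIN for at most max_tries attempts.
 * Returns the number of responses actually fetched.
 */
int model_poll(struct model_queue *q, int expected, int max_tries)
{
    int got = 0;

    while (got < expected && max_tries-- > 0) {
        if (model_recv(q) == 0)
            got++;
    }
    return got;
}
```

Polling an idle queue 1000 times with `model_poll(&q, 1, 1000)` fetches nothing but takes the lock 1000 times. In a threaded run, each of those acquisitions is a window in which a concurrent send would have to spin.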

How to solve:
1. Use a semaphore, and only call wd_rsa_poll_ctx after send.
   This already exists in openssl-uadk, but the uadk test application does not have it, so we do not consider it for now.

2. Even after send, some time is still required before receive.
   Move the spin_lock from rsa to qm, so that the locked section is smaller.
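The semaphore idea in option 1 can be sketched as follows. This is an illustrative model only, not the openssl-uadk implementation: `gated_queue`, `gated_send` and `gated_poll_once` are hypothetical names, a C11 `atomic_flag` stands in for the spin_lock, and the model assumes a request completes as soon as it is sent (which, as option 2 notes, real hardware does not). A real poll thread would more likely block in `sem_wait`; `sem_trywait` is used here so the sketch stays single-threaded.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdatomic.h>

/* Hypothetical queue whose poller is gated by a semaphore. */
struct gated_queue {
    atomic_flag lock;          /* stand-in for the driver's spin lock */
    sem_t in_flight;           /* counts requests sent but not yet received */
    int ready;                 /* completed responses waiting to be fetched */
    unsigned long lock_taken;  /* how often the lock was acquired */
};

static void gated_lock(struct gated_queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ; /* spin until the current holder releases the lock */
    q->lock_taken++;
}

static void gated_unlock(struct gated_queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Returns 0 on success, -1 if the semaphore cannot be created. */
int gated_init(struct gated_queue *q)
{
    atomic_flag_clear(&q->lock);
    q->ready = 0;
    q->lock_taken = 0;
    return sem_init(&q->in_flight, 0, 0);
}

/* send: queue one request, then signal that work is in flight. */
void gated_send(struct gated_queue *q)
{
    gated_lock(q);
    q->ready++; /* assumption: the request completes immediately */
    gated_unlock(q);
    sem_post(&q->in_flight);
}

/* poll: touch the lock only if something was actually sent. */
int gated_poll_once(struct gated_queue *q)
{
    if (sem_trywait(&q->in_flight) != 0)
        return -EAGAIN; /* nothing in flight, so leave the lock alone */

    gated_lock(q);
    q->ready--;
    gated_unlock(q);
    return 0;
}
```

With this gating, polling an idle queue returns -EAGAIN without ever acquiring the lock, so an idle poller no longer competes with send.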

commit fa9ef24
Author: Zhangfei Gao <zhangfei.gao@linaro.org>
Date: Mon Apr 19 09:35:46 2021 +0000

wd_rsa: remove spin_lock since qm already has lock

In async mode, polling thread call wd_rsa_poll_ctx, which keeps
trying driver->recv, but always fail with -WD_EAGAIN, performance
may impacted if lock is hold at this time.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>

commit 48d8d16
Author: Haojian Zhuang <haojian.zhuang@linaro.org>
Date: Thu Apr 1 15:12:52 2021 +0800

hisi: qm: fix the lock in receiving operation

The lock actually need to protect more resources. Since there's another
lock in algorithm for synchronization mode, these resources are always
protected by lock in algorithm. Now fix it.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>