ofiwg/librdmacm

limitation observed on the number of QPs that a single user-space process can create

sunmingbao opened this issue · 1 comments

Hi,
We are using librdmacm.so.1.1.16.9, which provides RDMA APIs (e.g., rdma_create_ep, rdma_listen, rdma_connect) for user-space apps.

However, we found that a single user-space process using this library can create only 339 QPs.
Could you confirm whether there is a limit on the number of QPs that a single user-space process can create with this library?

Below are some logs from our test (performed on a VM with the rdma_rxe driver):

System info:
linux-15v9:~ # cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12:sp4"

linux-15v9:~ # uname -r
4.12.14-94.41-default

linux-15v9:~ # lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 142
Model name: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz
Stepping: 12
CPU MHz: 2112.002
BogoMIPS: 4224.00
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xsaves arat flush_l1d arch_capabilities

Server side:
linux-15v9:~ # ib_perf.exe --server-ip 192.168.17.150 --server-port 10001 -s --qp-num 1024
qp [0] local 192.168.17.150:10001 peer 192.168.17.150:51705 created.
qp [1] local 192.168.17.150:10001 peer 192.168.17.150:37190 created.
......
qp [337] local 192.168.17.150:10001 peer 192.168.17.150:43572 created.
qp [338] local 192.168.17.150:10001 peer 192.168.17.150:52580 created.

Client side:
linux-15v9:/usr/lib64 # ib_perf.exe --server-ip 192.168.17.150 --server-port 10001 -c --qp-num 1024
qp [0] local 192.168.17.150:51705 peer 192.168.17.150:10001 created.
qp [1] local 192.168.17.150:37190 peer 192.168.17.150:10001 created.
......
qp [337] local 192.168.17.150:43572 peer 192.168.17.150:10001 created.
qp [338] local 192.168.17.150:52580 peer 192.168.17.150:10001 created.
ERR_DBG:/mnt/linux-dev-framework-master/apps/ib_perf/perf_frmwk.c(599)-create_connections_client:
rdma_create_ep failed: Cannot allocate memory

This question is better directed to the linux-rdma mailing list.

The limit may be the result of ulimit settings, such as the maximum locked memory (ulimit -l), restricting how much memory can be registered.
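
One way to check the ulimit theory is to look at the locked-memory limit (RLIMIT_MEMLOCK), since registered RDMA memory is pinned. The sketch below assumes bash on Linux. It only reports the limits of the current shell and does not prove that this limit is what stops QP creation at 339.

```shell
#!/bin/bash
# Soft and hard limits on locked (pinned) memory for this shell.
# Values are in KiB, or the literal string "unlimited".
soft=$(ulimit -S -l)
hard=$(ulimit -H -l)
echo "memlock soft=${soft} hard=${hard}"

# Limits of a running process can be read from procfs on Linux.
# $$ is this shell's own PID; substitute the PID of the test process.
if [ -r "/proc/$$/limits" ]; then
    grep -i 'max locked memory' "/proc/$$/limits"
fi
```

If the soft limit turns out to be small, raising it in the shell that launches the test (for example with ulimit -l unlimited, where the hard limit permits) and rerunning would show whether the 339 ceiling moves.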