Mellanox/k8s-rdma-shared-dev-plugin

Couldn't get device attributes


I tried to deploy the RDMA device plugin in HCA mode in my Kubernetes cluster. I followed the instructions, and the device plugin registered successfully. Running "kubectl describe node [node_name]" shows the rdma/hca resource. Running "ibstat" inside the pods shows the InfiniBand information, and the status is active/up.

However, when I tried to run a connection test with "ib_read_bw", it failed with the following error: "Couldn't get device attribute.
Unable to create QP.
Failed to create QP.
Couldn't create IB resource."

I ran the test by starting "ib_read_bw" in one pod and then running "ib_read_bw [target_pod_ip_addr]" in another pod. Could anyone please help with this issue? I appreciate your help.
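
For reference, the reproduction steps can be sketched roughly as the shell functions below. This is only an illustration: the pod names `rdma-test-pod-1` and `rdma-test-pod-2` and the function names are hypothetical placeholders, and it assumes both pods request the rdma/hca resource and have the perftest tools installed.

```shell
#!/bin/sh
# Hypothetical pod names (assumption); substitute the pods that request rdma/hca.
SERVER_POD="${SERVER_POD:-rdma-test-pod-1}"
CLIENT_POD="${CLIENT_POD:-rdma-test-pod-2}"

# Confirm the HCA is visible inside a given pod (the ibstat check described above).
check_hca() {
    kubectl exec "$1" -- ibstat
}

# Start ib_read_bw with no arguments; it then waits for a client to connect.
start_server() {
    kubectl exec "$SERVER_POD" -- ib_read_bw
}

# Look up the server pod's IP address and run the client side against it.
run_client() {
    server_ip=$(kubectl get pod "$SERVER_POD" -o jsonpath='{.status.podIP}')
    kubectl exec "$CLIENT_POD" -- ib_read_bw "$server_ip"
}
```

With this sketch, `start_server` would be run in one terminal and `run_client` in a second one once the server is waiting; `check_hca "$SERVER_POD"` repeats the ibstat check.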