k8snetworkplumbingwg/sriov-cni

DeviceId, min_tx_rate, max_tx_rate in the NAD not taking effect.

shankar-bala opened this issue · 8 comments

What issue would you like to bring attention to?

DeviceId, min_tx_rate, max_tx_rate in the NAD not taking effect.
I have a NAD and a pod with the following spec. I expected the pod to be assigned the VF that corresponds to the deviceID given in the NAD, but the pod comes up with an interface on a random VF. Similarly, I don't see the max_tx_rate config taking effect either.

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-net2
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/intel_sriov_netdevice
spec:
  config: '{
    "type": "sriov",
    "cniVersion": "0.3.1",
    "name": "sriovi-net",
    "vlan": 2000,
    "mac": "BA:FE:C0:FF:EE:00",
    "link_state": "enable",
    "max-tx-rate": 200,
    "trust": "on",
    "ipam": {
      "type": "whereabouts",
      "range": "10.1.1.1/24",
      "gateway": "10.1.1.254"
    }
  }'

apiVersion: v1
kind: Pod
metadata:
  name: testpod2
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-net2
spec:
  containers:
  - name: appcntr1
    image: openshift/hello-openshift
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        intel.com/intel_sriov_netdevice: '1'
      limits:
        intel.com/intel_sriov_netdevice: '1'

What is the impact of this issue?

Not able to assign a pod to a particular VF.

Do you have a proposed response or remediation for the issue?

no

@shankar-bala The device ID is randomly chosen by kubelet if you have multiple devices in the resource pool, so this is the expected behavior if sriov-network-device-plugin or sriov-network-operator is used in your deployment.

Regarding the max TX rate: the JSON key for it is max_tx_rate, so you may want to try replacing the dashes with underscores in the NAD.
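For reference, a minimal sketch of the NAD from the original post with only that key renamed. Everything else is copied unchanged from the post; whether the rate limit is actually enforced still depends on the NIC and its driver.

```yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-net2
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/intel_sriov_netdevice
spec:
  # Same config as in the original post; the only change is
  # "max-tx-rate" -> "max_tx_rate" (underscores instead of dashes).
  config: '{
    "type": "sriov",
    "cniVersion": "0.3.1",
    "name": "sriovi-net",
    "vlan": 2000,
    "mac": "BA:FE:C0:FF:EE:00",
    "link_state": "enable",
    "max_tx_rate": 200,
    "trust": "on",
    "ipam": {
      "type": "whereabouts",
      "range": "10.1.1.1/24",
      "gateway": "10.1.1.254"
    }
  }'
```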

Thanks a lot for the comments. Appreciate it...

One more quick clarification: in what cases does the deviceID work, then?

It works when the deviceID (the PCI address of the VF) is specified in the NAD and no resource request is added to the pod manifest. By the way, this is not a recommended approach since it doesn't scale.
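A minimal sketch of that setup, under stated assumptions: the PCI address 0000:03:02.0 is a hypothetical placeholder for a VF that exists on the node, the resourceName annotation is left out on the assumption that it is not needed when the device plugin is not allocating the VF, and the remaining values are reused from the original post.

```yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-net2
spec:
  # "deviceID" pins the attachment to one VF by PCI address.
  # 0000:03:02.0 is a placeholder; substitute the address of a real VF.
  config: '{
    "type": "sriov",
    "cniVersion": "0.3.1",
    "name": "sriovi-net",
    "deviceID": "0000:03:02.0",
    "vlan": 2000,
    "max_tx_rate": 200,
    "ipam": {
      "type": "whereabouts",
      "range": "10.1.1.1/24",
      "gateway": "10.1.1.254"
    }
  }'
---
apiVersion: v1
kind: Pod
metadata:
  name: testpod2
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-net2
spec:
  containers:
  - name: appcntr1
    image: openshift/hello-openshift
    imagePullPolicy: IfNotPresent
    # No resources.requests/limits for intel.com/intel_sriov_netdevice,
    # so kubelet does not pick a VF from the resource pool.
```

Because the VF is hard-coded, only one pod at a time can use this NAD and the pod is not scheduled based on VF availability, which is why this approach does not scale.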

Awesome.. thanks, yes removing the resources worked. The max_tx_rate seems to work. But min_tx_rate does not. It errors.

SRIOV-CNI failed to configure VF "failed to set vf 0 min_tx_rate to 200 Mbps: max_tx_rate to 0 Mbps: invalid argument"

Some Intel cards don't support min_tx_rate afaik.

@shankar-bala As @zshi-redhat pointed out, Intel cards don't support min_tx_rate. This is not something the operator can handle; it depends on the driver.

Will it be OK to close this issue?

No answer for 12 days, so I am closing this issue.

If you need more help, please feel free to reopen the issue.