Fail to set dcb on
Hello
Can I ask what driver is running on enp1s0?
I was able to reproduce this issue on an ixgbe device, and I think this behavior is caused by the driver. Even if you aren't using ixgbe you might be having a similar issue. The flow I see is something like this:
- dcbtool enables the dcb feature--this sets the IXGBE_FLAG_DCB_ENABLED flag in the driver.
- lldpad eventually calls DCB_CMD_SDCBX which runs ixgbe_set_dcbx(). Currently the interface is still in IEEE mode which means ixgbe will attempt to set an ETS configuration.
- ixgbe_dcbnl_ieee_setets() attempts to apply the dcb configuration via ixgbe_setup_tc()
- ixgbe_setup_tc() will notice that there are zero active tcs since a dcb map hasn't been requested, so it clears the IXGBE_FLAG_DCB_ENABLED flag
- when the user runs dcbtool gc $if dcb the state is now off
Are you attempting to negotiate a DCB map with a CEE link partner?
Hi sregister,
The driver running on enp1s0 is ixgbe 5.6.5.
The output of lldptool -t -i enp1s0 -V IEEE-DCBX -c mode
is mode=auto, so the DCBX mode should be IEEE DCBX. However, both ends of the link report DCBX Version: CEE.
I also posted this problem to the Intel forum, where you can find more information.
If you need additional info, please tell me.
If you'd like to use CEE DCBx then you can do this:
On the peer, force CEE mode and restart lldpad to apply the change:
dcbtool sc dcbx v:force-cee
systemctl restart lldpad
On both the local and peer machines, inspect:
lldptool -tni $iface
lldptool -ti $iface
dcbtool gc $iface dcb
Both interfaces should be sending/receiving CEE TLVs and dcb state should be on. Currently both machines are in willing mode--to fix this set the peer to advertise a non-willing dcb configuration:
dcbtool sc $peer_iface pg e:1 w:0 pgid:00010000 pgpct:25,75,0,0,0,0,0,0
dcbtool sc $peer_iface pfc e:1 w:0 pfcup:00010000
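The pfcup argument above is an eight-character string with one 0/1 character per user priority, 0 to 7 from left to right, so pfcup:00010000 enables PFC on priority 3 only (pgid uses the same positional layout, with a group ID in each position). A minimal sketch for building such a string; pfcup_mask is a hypothetical helper name and is not part of dcbtool:

```shell
#!/bin/sh
# pfcup_mask: hypothetical helper, not part of dcbtool.
# Prints an 8-character 0/1 string with a 1 at each priority passed as an argument.
pfcup_mask() {
    mask=""
    i=0
    while [ "$i" -lt 8 ]; do
        bit=0
        for p in "$@"; do
            if [ "$p" -eq "$i" ]; then
                bit=1
            fi
        done
        mask="${mask}${bit}"
        i=$((i + 1))
    done
    printf '%s\n' "$mask"
}
```

It could then be used as dcbtool sc "$peer_iface" pfc e:1 w:0 pfcup:"$(pfcup_mask 3)".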
Both devices should now have applied the requested configuration via CEE-DCBx:
# dcbtool gc $local_iface dcb
Command: Get Config
Feature: DCB State
Port: ens16
Status: Successful
DCB State: on
# dcbtool go $local_iface pg
Command: Get Oper
Feature: Priority Groups
Port: ens16
Status: Successful
Oper Version: 0
Max Version: 0
Errors: 0x00 - none
Oper Mode: true
Syncd: true
up2tc: 0 0 0 1 0 0 0 0
pgpct: 25% 75% 0% 0% 0% 0% 0% 0%
pgid: 0 0 0 1 0 0 0 0
uppct: 100% 100% 100% 100% 100% 100% 100% 100%
pg strict: 0 0 0 0 0 0 0 0
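The two checks above can also be scripted. This is a minimal sketch that only parses text on stdin and assumes the output layout matches the listing above (real dcbtool output may differ in whitespace); dcb_state and pgpct_total are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helpers for checking dcbtool output; they only read text from stdin.

# Print the value on the "DCB State:" line (for example "on" or "off").
dcb_state() {
    awk '/^[[:space:]]*DCB State:/ {
        sub(/^[[:space:]]*DCB State:[[:space:]]*/, "")
        sub(/[[:space:]]+$/, "")
        print
        exit
    }'
}

# Print the sum of the percentages on the "pgpct:" line.
pgpct_total() {
    awk '/^[[:space:]]*pgpct:/ {
        total = 0
        for (i = 2; i <= NF; i++) {
            sub(/%/, "", $i)
            total += $i
        }
        print total
        exit
    }'
}
```

For example, dcbtool gc "$local_iface" dcb | dcb_state should print on, and dcbtool go "$local_iface" pg | pgpct_total should print 100 for the configuration requested above.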
If you are still having issues, stop lldpad, clear /var/lib/lldpad/lldpad.conf and try again. If that doesn't work, try with PR #41.
Hi sregister,
Thank you so much for your helpful solution 👍
To clarify the problem, I would like to learn more:
(1) In step 4, is it right that a dcb map hasn't been requested because the link partner is in CEE mode, so I have to negotiate with it first? If the link partner is in IEEE DCBX mode, how do I get a dcb map?
(2) On the local machine, dcbtool gc dcbx
reported CEE mode without my having configured anything.
Why not IEEE DCBX, which is the default DCBX version?
And why do we need to switch to force-CEE mode?
Look forward to your reply! Thx!
CEE has control and feature state machines to ensure symmetric configuration before the map becomes operational. IEEE does not have such strict requirements; each side advertises its active configuration, willing/non-willing status, and a recommendation for the peer. Each side works out what it should do but doesn't require that the peer be in sync.
An IEEE map can be applied using the examples here:
https://github.com/intel/openlldp/blob/master/docs/lldptool-ets.8#L121
The transition between IEEE(default)->CEE(legacy) is described here: https://github.com/intel/openlldp/blob/master/docs/lldptool-dcbx.8#L26
Good DCBx version comparisons can be found online--hope this solves the problem.
Some things to check:
- Are you using a tagged vlan? The priority is carried in the vlan header, and if the tag gets dropped the receiver won't know how to classify RX traffic.
- Are you sending priority tagged traffic? Run tcpdump/wireshark on the receiver to confirm you see the correct priority in the vlan header. To make iperf send tagged traffic you need to use net_prio cgroups:
cd /sys/fs/cgroup/net_prio
mkdir pri3
cd pri3
echo 'ens4f0 3' > net_prio.ifpriomap
echo 'ens4f0.vlan3260 3' > net_prio.ifpriomap
echo "$pid_of_iperf" > tasks
The iperf process should now be sending traffic with 3 in the priority field in the vlan header.
- Confirm that pfc is enabled for the priority you are sending/receiving on: dcbtool go $iface pfc
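Because a process's PID is only known once it has started, one option for the net_prio step is to have a small shell join the cgroup first and then exec the traffic generator, so the program starts with the PID that was already written to tasks. A minimal sketch, assuming a cgroup v1 net_prio hierarchy like the one above and root privileges; run_in_net_prio is a hypothetical wrapper name:

```shell
#!/bin/sh
# run_in_net_prio: hypothetical wrapper, not part of iperf or lldpad.
# $1 is the cgroup directory (for example /sys/fs/cgroup/net_prio/pri3);
# the remaining arguments are the command to run inside that cgroup.
run_in_net_prio() {
    cgroup_dir=$1
    shift
    # The inner shell writes its own PID to "tasks" and then execs the
    # command, so the command keeps that PID and starts inside the cgroup.
    sh -c 'echo "$$" > "$1/tasks" && shift && exec "$@"' sh "$cgroup_dir" "$@"
}
```

For example: run_in_net_prio /sys/fs/cgroup/net_prio/pri3 iperf -c "$server", where $server is a placeholder for the receiver's address.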
Hi sreg,
Thank you so much. The problem was indeed caused by the vlan tag. I have two questions:
- echo '$iface.vlan3260 3' > net_prio.ifpriomap
returns "write error: No such device". Also, for
echo "$pid_of_iperf" > tasks
how do I find the pid of iperf? Its pid changes each time I run it.
- I followed the method in the readme and set
tc filter add dev enp1s0 protocol ip parent 1: u32 match ip dport 5003 0xffff action skbedit queue_mapping 3
which the readme claims will tag outgoing frames with the corresponding 802.1p priority value. However, tcpdump showed that the priority in the vlan tag was still 0. So I tried
tc filter add dev enp1s0 protocol ip parent 1: u32 match ip dport 5003 0xffff action skbedit queue_mapping 3 priority 3
and this did set the priority in the vlan tag to 3.
I have another problem.
Is it right that we can't use a DPDK packet generator application, which uses the igb_uio driver, in dcb mode?
The result of dcbtool shows "Device not capable".
I'm not sure how to get a bifurcated setup working with DPDK. Some of the registers will not be available to manipulate from the kernel driver side, and I guess that might include the DCB information.
I'm closing this now. Please open a new issue in case you have other problems. Thanks for the reports!