pcengines/apu2-documentation

Linux Only Ever uses one NIC RX Queue

Closed this issue · 8 comments

On multiple APU4s I have, all on 4.11.0.6 (though I've seen the issue on other firmwares) on new kernels (at least 5.7 and 5.9, but I've seen the issue on older kernels as well), I have never managed to get packets to use the second NIC RX queue on any NIC. For example, ethtool (and top's softirq usage) confirms this on all NICs, even with lots of traffic with different src/dst IP/port pairs. ethtool -N enp1s0 rx-flow-hash tcp4/udp4 X doesn't change anything, and even ethtool -X enp1s0 start 1 doesn't manage to move any flows onto the second queue, despite ostensibly disabling the first queue entirely.

# ethtool -S enp1s0 | grep rx_queue
     rx_queue_0_packets: 16630368449
     rx_queue_0_bytes: 6921128796303
     rx_queue_0_drops: 0
     rx_queue_0_csum_err: 3618641
     rx_queue_0_alloc_failed: 0
     rx_queue_1_packets: 0
     rx_queue_1_bytes: 0
     rx_queue_1_drops: 0
     rx_queue_1_csum_err: 0
     rx_queue_1_alloc_failed: 0
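
For anyone comparing their own counters against the output above, here is a minimal sketch that turns `ethtool -S` output into a per-queue share of received packets. The function name `rx_queue_share` is made up for illustration; it only assumes the `rx_queue_N_packets` counter names shown in this thread.

```shell
#!/bin/sh
# Summarize how RX packets are spread across queues.
# Reads `ethtool -S <iface>` output on stdin and prints one line per queue.
# rx_queue_share is a hypothetical helper name, not part of ethtool.
rx_queue_share() {
    awk -F': *' '
        /rx_queue_[0-9]+_packets/ {
            name = $1
            gsub(/^[ \t]+/, "", name)     # strip leading indentation
            sub(/_packets$/, "", name)    # rx_queue_0_packets -> rx_queue_0
            count[name] = $2
            order[n++] = name             # remember the original queue order
            total += $2
        }
        END {
            for (i = 0; i < n; i++) {
                q = order[i]
                pct = (total > 0) ? 100 * count[q] / total : 0
                printf "%s %.1f%%\n", q, pct
            }
        }'
}

# Usage on a live system (interface name taken from the post above):
#   ethtool -S enp1s0 | rx_queue_share
```

A result where one queue sits at 100% despite many distinct flows is the symptom described in this issue.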

I feel your frustration, I was in the same boat when I bought my APU2c4 years ago.

Unfortunately, RSS on Intel NICs is quite a complex beast: it's quite possible there isn't enough traffic of the right type for the driver to push into the second queue. See 7.1.2.10; buggy SKUs are also not unheard of.

Unsure if it's related but you have many checksum errors, may be worth figuring out why that is.

  • What are the outputs of egrep 'CPU|enp1s0' /proc/interrupts and ethtool -u enp1s0?
  • Is the system running irqbalance?
  • Which distro and kernel is this?

Here are some stats from my NIC, which is an i210, not an i211 like yours; it just has four RSS queues instead of two.

# uname -a
OpenWrt 19.07.4 Linux apu 4.14.195 #0 SMP Sun Sep 6 16:19:39 2020 x86_64 GNU/Linux

# ethtool -S eth0 | grep rx_queue
     rx_queue_0_packets: 28051050
     rx_queue_0_bytes: 28167294565
     rx_queue_0_drops: 0
     rx_queue_0_csum_err: 0
     rx_queue_0_alloc_failed: 0
     rx_queue_1_packets: 0
     rx_queue_1_bytes: 0
     rx_queue_1_drops: 0
     rx_queue_1_csum_err: 0
     rx_queue_1_alloc_failed: 0
     rx_queue_2_packets: 0
     rx_queue_2_bytes: 0
     rx_queue_2_drops: 0
     rx_queue_2_csum_err: 0
     rx_queue_2_alloc_failed: 0
     rx_queue_3_packets: 0
     rx_queue_3_bytes: 0
     rx_queue_3_drops: 0
     rx_queue_3_csum_err: 0
     rx_queue_3_alloc_failed: 0

# egrep 'CPU|eth0' /proc/interrupts
            CPU0       CPU1       CPU2       CPU3       
  36:        124          0          0          1   PCI-MSI 1048576-edge      eth0
  37:        287        271        322   26699895   PCI-MSI 1048577-edge      eth0-TxRx-0
  38:         28         36         30    5207998   PCI-MSI 1048578-edge      eth0-TxRx-1
  39:        182    5432009        179        258   PCI-MSI 1048579-edge      eth0-TxRx-2
  40:         17         12     546360         13   PCI-MSI 1048580-edge      eth0-TxRx-3

You could try to alleviate the situation by configuring ntuple filtering to steer network flows.
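
As a rough sketch of what ntuple steering could look like, the helper below only prints the ethtool commands so they can be reviewed before being run as root. The function name `ntuple_steer_cmds` is hypothetical. Matching on the Ethernet protocol field is an assumption about what the igb driver accepts; the thread does not say which flow types or fields work on a given kernel, so treat the rule as illustrative.

```shell
#!/bin/sh
# Print (do not run) the ethtool commands that would steer frames with a
# given EtherType to a given RX queue.
# Arguments: interface name, EtherType (e.g. 0x86dd for IPv6), queue index.
# ASSUMPTION: the driver accepts "flow-type ether proto ..." rules; this is
# not confirmed anywhere in the thread.
ntuple_steer_cmds() {
    iface=$1
    ethertype=$2
    queue=$3
    echo "ethtool -K $iface ntuple on"      # enable ntuple filtering
    echo "ethtool -N $iface flow-type ether proto $ethertype action $queue"
    echo "ethtool -u $iface"                # list the installed rules
}

# Example: review the commands that would send IPv6 frames to queue 1.
#   ntuple_steer_cmds enp1s0 0x86dd 1
```

If a rule is rejected, the driver most likely does not support that match on that kernel, and the per-queue counters from ethtool -S are the way to check whether an accepted rule has any effect.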

This blog post is a gold mine for tweaking NIC driver networking on Linux, with relevant examples for the igb driver.
It sheds light on just how complicated the Linux networking stack is, and why it is impossible to monitor, or tune it, without understanding at a deep level exactly what’s going on.

The igb driver has a bug in it where it does not currently enable RSS for the Intel i211 (2 RSS queues supported), but does for the Intel i210 (4 RSS queues supported).

I have a patch queued up to correct this, which will land in the 5.12 Linux kernel. After that I will seek to get it backported to the 5.4 and 5.10 LTS kernels:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6e6026f2dd2005844fb35c3911e8083c09952c6c

APU2C4 (BIOS v4.13.0.2) here running Debian Bullseye (Testing) with Kernel 5.10.9-1:

root@apu2:~# ethtool -S eth0 | grep rx_queue_
     rx_queue_0_packets: 6514792
     rx_queue_0_bytes: 1545732584
     rx_queue_0_drops: 0
     rx_queue_0_csum_err: 0
     rx_queue_0_alloc_failed: 0
     rx_queue_1_packets: 6232786
     rx_queue_1_bytes: 1130680901
     rx_queue_1_drops: 0
     rx_queue_1_csum_err: 0
     rx_queue_1_alloc_failed: 0
     rx_queue_2_packets: 6281831
     rx_queue_2_bytes: 1096616674
     rx_queue_2_drops: 0
     rx_queue_2_csum_err: 0
     rx_queue_2_alloc_failed: 0
     rx_queue_3_packets: 4802969
     rx_queue_3_bytes: 667748912
     rx_queue_3_drops: 0
     rx_queue_3_csum_err: 0
     rx_queue_3_alloc_failed: 0

Right, for the I210 (APU2C) there are 4 queues and the bug does not impact it. For most APU models (i211), we have to wait for Nick's patch to make its way upstream and then be backported. Gonna close this as it's not a BIOS issue and the fix was found.

Does this igb RSS queue bug affect Freebsd and its downstream clones as well?

No, not this specific one.

The fix is of course in the 5.12 kernel, and it has also been backported to 5.11.19, though that branch will go EOL soon, so it's probably of little interest.
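
To check a given box against the versions named in this thread, a small sketch like the following may help. The function name `igb_rss_fix_expected` is made up. It only knows about 5.12 and later and 5.11.19 and later, because the thread states an intent to backport to 5.4 and 5.10 but does not confirm it. Distro kernels may carry the patch regardless of version number.

```shell
#!/bin/sh
# Return 0 if a kernel version is one this thread says contains the i211 RSS
# fix (>= 5.12, or 5.11.x with x >= 19), and 1 otherwise.
# A return of 1 means "not confirmed by this thread", not "definitely broken":
# LTS or distro backports are not accounted for.
igb_rss_fix_expected() {
    ver=${1%%-*}                  # drop suffixes such as "-1" or "-amd64"
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    patch=${rest#*.}
    [ "$patch" = "$rest" ] && patch=0     # e.g. "5.12" has no patch level
    patch=${patch%%[!0-9]*}               # keep leading digits only
    patch=${patch:-0}

    if [ "$major" -gt 5 ]; then return 0; fi
    if [ "$major" -lt 5 ]; then return 1; fi
    if [ "$minor" -ge 12 ]; then return 0; fi
    if [ "$minor" -eq 11 ] && [ "$patch" -ge 19 ]; then return 0; fi
    return 1
}

# Usage on a live system:
#   if igb_rss_fix_expected "$(uname -r)"; then
#       echo "kernel should include the i211 RSS fix"
#   else
#       echo "fix not confirmed for this kernel version"
#   fi
```

This only matters for the i211; the i210 was never affected by this bug.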