Pi3B wifi brcmf_sdio_hdparse
@pelwell Phil, is this expected? I've got pages and pages of them. (It's not something I've seen before with the brcmfmac driver, and I've used it before with several other SBCs.)
brcmfmac: brcmf_sdio_hdparse: seq 106: sequence number error, expect 108
brcmfmac: brcmf_sdio_hdparse: seq 109: sequence number error, expect 108
brcmfmac: brcmf_sdio_hdparse: seq 110: sequence number error, expect 109
Yes, I've seen those. My theory is that they are caused by out-of-order packet arrival, and that they are related to a low-power mode of the 43430. Try opening a connection (e.g. ssh) with a keepalive of less than 30 seconds and see if the messages still appear.
I'm not convinced it has anything to do with power management. I'm seeing them during heavy traffic.
Mar 01 20:47:28 brcmfmac: brcmf_sdio_hdparse: seq 249: sequence number error, expect 248
Mar 01 20:47:28 brcmfmac: brcmf_sdio_hdparse: seq 248: sequence number error, expect 250
Mar 01 20:47:28 brcmfmac: brcmf_sdio_hdparse: seq 250: sequence number error, expect 249
I think this may need reporting back to Broadcom. I'm not sure what they are responsible for..... Just hardware and firmware..... Or driver as well?
I also think this "out of sequence" issue may have some bearing on the crappy peak throughput numbers, less than 30Mbps. I've gotten used to 100Mbps peak throughput from the "cheap as chips" mt7601u dongles. Whilst I appreciate that my use case(s) wouldn't be classed as "educational" (streaming high-res audio), and although I don't have hard numbers in front of me for them, IIRC Wandboard and Cubietruck, which both use earlier revisions of bcm43xx hardware, were capable of peak throughput > 40Mbps.
Anyway, I've tested to the point where it's now time to decide that the on-board wi-fi "feature" isn't suitable for my use case(s), blacklist the brcmfmac module and use the dongles I was using with previous Pi hardware.
And not that I believe this has any bearing on this, but it is curious......
Raspbian......
Firmware version = wl0: Dec 15 2015 18:10:45 version 7.45.41.23 (r606571) FWID 01-cc4eda9c
The Ubuntu MATE image is shipping ....
Firmware version = wl0: Sep 4 2015 16:45:22 version 7.10.323.40 (r523180) FWID 01-aaca02c8
I'm not sure where they both got the firmware..... I assume it must have been directly from PiF. (It doesn't appear to be in the linux-firmware repo, which is a problem in itself for other distributions that won't ship firmware unless it has been accepted into linux-firmware.......)
I'm seeing the same problem.
I did tests with the older firmware (Sep 4 2015 16:45:22), problems are worse with it.
But I still get those errors with the Dec 15 2015 18:10:45 -version.
Just FYI, I'm also seeing this periodically with a very simple Arch linux setup after setting up WiFi.
In my case, I'm not streaming anything at all and have almost nothing running at the time (it was idling overnight). I'm not sure what services might actually have tried to do anything at all. Perhaps the time sync?
Broadcom are aware of the issue. They believe that it is a threading problem within the driver, and they are working on a fix.
The apparent link with power saving was just coincidental - I have also seen those messages since the patch, although they are still rare for me.
Not so rare, used as a music streamer with constant wi-fi traffic, continuously playing back 44k1/16/2 media....
$ journalctl -b | grep brcm > journal_AKK3B2_brcm_20160320.txt
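A saved journal like the one above can be summarised rather than read line by line. The following is a minimal sketch, not anything from the driver: the function name `seq_offsets` is made up for illustration, the regular expression only targets the message format quoted in this thread, and the modulo-256 wrap is an assumption based on the sequence numbers in the logs never exceeding 255.

```python
import re
from collections import Counter

# Matches the brcmfmac message format quoted in this thread.
SEQ_ERR = re.compile(
    r"brcmf_sdio_hdparse: seq (\d+): sequence number error, expect (\d+)"
)

def seq_offsets(lines, modulus=256):
    """Count how far each reported frame was from the expected one.

    Offsets are (got - expected), wrapped into [-modulus/2, modulus/2).
    The modulus of 256 is an assumption, not taken from the driver source.
    """
    offsets = Counter()
    for line in lines:
        m = SEQ_ERR.search(line)
        if not m:
            continue  # not a sequence error line
        got, expected = int(m.group(1)), int(m.group(2))
        diff = (got - expected) % modulus
        if diff >= modulus // 2:
            diff -= modulus  # treat large positive wraps as "behind"
        offsets[diff] += 1
    return offsets

sample = [
    "Mar 01 20:47:28 brcmfmac: brcmf_sdio_hdparse: seq 249: sequence number error, expect 248",
    "Mar 01 20:47:28 brcmfmac: brcmf_sdio_hdparse: seq 248: sequence number error, expect 250",
    "Mar 01 20:47:28 brcmfmac: brcmf_sdio_hdparse: seq 250: sequence number error, expect 249",
]
print(seq_offsets(sample))
```

For the three sample lines this counts an offset of +1 twice and -2 once, i.e. frames arriving one early or two late. To run it over a saved file, pass the open file object instead of `sample`, e.g. `seq_offsets(open("journal_AKK3B2_brcm_20160320.txt"))`.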
Glad to see this isn't just me. :) Ubuntu Mate w/a kernel built the day before yesterday... interesting re the different firmware versions. Is there some place where Broadcom posts the latest version?
If it helps, I saw this today accompanied by a load of traces in dmesg:
[89265.310174] ------------[ cut here ]------------
[89265.405363] WARNING: CPU: 1 PID: 616 at drivers/net/wireless/brcm80211/brcmfmac/core.c:1144 brcmf_netdev_wait_pend8021x+0xfc/0x108 [brcmfmac]()
[89265.663542] Modules linked in: tcp_diag inet_diag nfsd nls_utf8 ntfs 8021q garp bridge stp llc dm_mirror dm_region_hash dm_log dm_mod bcm2835_rng bcm2835_gpiomem uio_pdrv_genirq uio sch_fq_codel brcmfmac brcmutil cfg80211 rfkill ip_tables x_tables ipv6
[89266.108656] CPU: 1 PID: 616 Comm: hostapd Tainted: G W 4.1.20-v7+ #867
[89266.261972] Hardware name: BCM2709
[89266.329332] [<800185e0>] (unwind_backtrace) from [<80013f48>] (show_stack+0x20/0x24)
[89266.484582] [<80013f48>] (show_stack) from [<80572ddc>] (dump_stack+0xd4/0x118)
[89266.627843] [<80572ddc>] (dump_stack) from [<800271b4>] (warn_slowpath_common+0x98/0xc8)
[89266.793234] [<800271b4>] (warn_slowpath_common) from [<800272a0>] (warn_slowpath_null+0x2c/0x34)
[89266.972175] [<800272a0>] (warn_slowpath_null) from [<7f1316bc>] (brcmf_netdev_wait_pend8021x+0xfc/0x108 [brcmfmac])
[89267.181483] [<7f1316bc>] (brcmf_netdev_wait_pend8021x [brcmfmac]) from [<7f120878>] (send_key_to_dongle+0xa4/0xf8 [brcmfmac])
[89267.409303] [<7f120878>] (send_key_to_dongle [brcmfmac]) from [<7f120ac4>] (brcmf_cfg80211_del_key+0x68/0x78 [brcmfmac])
[89267.628723] [<7f120ac4>] (brcmf_cfg80211_del_key [brcmfmac]) from [<7f0a5558>] (nl80211_del_key+0xfc/0x28c [cfg80211])
[89267.843046] [<7f0a5558>] (nl80211_del_key [cfg80211]) from [<804c839c>] (genl_rcv_msg+0x26c/0x3ec)
[89268.025274] [<804c839c>] (genl_rcv_msg) from [<804c7584>] (netlink_rcv_skb+0xb0/0xcc)
[89268.185586] [<804c7584>] (netlink_rcv_skb) from [<804c8120>] (genl_rcv+0x34/0x44)
[89268.339165] [<804c8120>] (genl_rcv) from [<804c6ec8>] (netlink_unicast+0x180/0x244)
[89268.496129] [<804c6ec8>] (netlink_unicast) from [<804c7360>] (netlink_sendmsg+0x30c/0x378)
[89268.664916] [<804c7360>] (netlink_sendmsg) from [<8047d528>] (sock_sendmsg+0x24/0x34)
[89268.825244] [<8047d528>] (sock_sendmsg) from [<8047dfec>] (___sys_sendmsg+0x1dc/0x1e4)
[89268.987259] [<8047dfec>] (___sys_sendmsg) from [<8047ed44>] (__sys_sendmsg+0x4c/0x7c)
[89269.147589] [<8047ed44>] (__sys_sendmsg) from [<8047ed8c>] (SyS_sendmsg+0x18/0x1c)
[89269.299551] [<8047ed8c>] (SyS_sendmsg) from [<8000fa20>] (ret_fast_syscall+0x0/0x54)
[89269.319065] ---[ end trace 5755d7b4b3de9e3d ]---
Am using the RP3 as a headless access point (you can see hostapd mentioned in the trace) running FC23 setup as described here:
http://hobo.house/2016/03/13/installing-fedora-linux-on-the-raspberry-pi-3/
I am also using a RP3 as a wifi access point, Kernel 4.1.10-v7+, and am seeing these messages as well in the kernel log. On the plus side, out of order or not, I haven't noticed any problems when using the access point.
For me, this seems to be associated with a crash of the wifi stuff, so that wlan0 gets disconnected and I have to do a
sudo systemctl restart networking
to get it to come alive again.
Attached is stuff from syslog.
syslog.txt
At 09:42:32, wlan0 is alive enough to hear a dhcp renewal. At 09:46:58, we get the seq error, and wlan0 dies and doesn't come back.
Still happens with 4.4.7
Perhaps re-sort the packets in the kernel?
I was getting these crashes so the pi 3 would not run for as much as a day.
I added heat sinks (and perhaps updated some software, although I don't know exactly what) and they've stayed alive for more than 2 days now.
At the same time I added heat sinks, I added a cron thingie that would check every 2 minutes and appropriately kick the net if it found the wifi down. I have not needed to do a kick since adding the heat sinks.
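For anyone wanting a similar cron check, here is a rough sketch of one way it could look. It is not the poster's actual script: the function names are hypothetical, reading `/sys/class/net/wlan0/operstate` is just one assumed way of deciding that the wifi is down, and the restart command is the `systemctl restart networking` quoted earlier in the thread (it needs root).

```python
import subprocess
from pathlib import Path

# Assumed location of the interface state; wlan0 is the name used in this thread.
DEFAULT_OPERSTATE = "/sys/class/net/wlan0/operstate"

def wifi_is_up(operstate_path=DEFAULT_OPERSTATE):
    """Return True when the kernel reports the interface operstate as 'up'."""
    try:
        return Path(operstate_path).read_text().strip() == "up"
    except OSError:
        return False  # a missing interface is treated as down

def restart_networking():
    """Kick the network with the command quoted earlier in the thread."""
    subprocess.run(["systemctl", "restart", "networking"], check=False)

def kick_if_down(operstate_path=DEFAULT_OPERSTATE, restart=restart_networking):
    """Restart networking once if the interface is down.

    Returns True if a kick was performed, False if the link was already up.
    """
    if wifi_is_up(operstate_path):
        return False
    restart()
    return True
```

A small script that calls `kick_if_down()` could then be run from root's crontab on a `*/2 * * * *` schedule to match the 2 minute interval. The `restart` parameter is only there so the check can be tried out without actually restarting anything.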
I wonder if this is related to issue #1471? ("Memory leak in pi3 wifi driver?")
brcmf_sdio_hdparse is called in drivers/net/wireless/brcm80211/brcmfmac/sdio.c line 1952 when the reorder problem occurs.
Value of rxleft (see line 1918) is ~50 - so it should be possible to fix the problem.
Just check the next packet and swap them if its sequence number is just one off.
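To make the swap idea concrete, here is a user-space simulation of it in Python. It is only a sketch and not kernel code: `reorder_adjacent` is a hypothetical name, the one-frame look-ahead is the simplest reading of the suggestion above, and both the modulo-256 wrap and the "resynchronise on whatever was delivered" behaviour are assumptions inferred from the log lines in this thread, not from the driver source.

```python
def reorder_adjacent(frames, expected, modulus=256):
    """Deliver frames in order, swapping a frame with its successor
    when the successor is the one that was expected.

    frames   -- sequence numbers in arrival order
    expected -- sequence number the receiver expects first
    modulus  -- wrap-around point of the counter (assumed to be 256)
    """
    frames = list(frames)
    delivered = []
    for i in range(len(frames)):
        has_next = i + 1 < len(frames)
        if frames[i] != expected and has_next and frames[i + 1] == expected:
            # The expected frame is right behind this one: swap the pair.
            frames[i], frames[i + 1] = frames[i + 1], frames[i]
        delivered.append(frames[i])
        # Resynchronise on the frame actually delivered (assumed behaviour).
        expected = (frames[i] + 1) % modulus
    return delivered

# Arrival order taken from the log lines earlier in the thread.
print(reorder_adjacent([249, 248, 250], expected=248))
```

With the arrival order 249, 248, 250 and 248 expected, the first two frames are swapped and everything is delivered in sequence. A frame that is more than one position away is left alone, so this only covers the "just one off" case described above.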
happens on 4.4.11-v7+ 888 SMP continuously:
seq#_err_20160530.txt
Just installed Mate 16.04 last night, seeing this problem.
Firmware version = wl0: Dec 15 2015 18:10:45 version 7.45.41.23 (r606571) FWID 01-cc4eda9c
I also see these error messages:
Kernel: Linux rpi3 4.4.11-v7+ #888 SMP
Firmware:version faf071dd4885c5ac1a89483d35a5326e7f69495f (clean) (release)
Broadcom have released a patch that changes the error messages to debug messages, with the following commit message:
brcmfmac: change rx_seq check log from error print to debug print
The bus rx sequence is not in order because that control and event
frames always cause immediate send, but data frames may be held
for glomming in firmware side. It is not actually an error as the
packets are still processed even if the RX sequence is not in order.
Therefore the error message is rephrased and changed to a debug
message.
I've applied this patch to rpi-4.4.y.
broom... carpet... lift... sweep.... ;)
It might not actually be an error, as far as they are concerned, but from what I experienced, the throughput goes down the toilet as soon as those out of sequence messages start appearing..... How do they explain that? I suspect the fact that the logging is "fixed", isn't going to fix the throughput issue. (Not that it affects me. I stopped using the Pi3B on-board wifi some time ago.)
@koppi Well, this latest-gen Broadcom 43 series chipset that the Pi3B uses does seem to have throughput issues compared to the earlier generation BRCM 43 series chips that I have come across on other SBCs. Even though you can't expect class-leading performance (it's 'n', but 20MHz single channel 'n'), from the testing I did back in March when the Pi3B was launched, against on-board BRCM 43xxx wifi with Wandboard and Cubietruck, the Pi3B's average wifi throughput was under half that of the WB and CT in the best conditions (i.e. only client of the AP), and embarrassingly bad once the "out of sequence" logging started, to the point where I couldn't even stream 96/24 audio without continual stuttering. Easily "fixed" with a $2 MT7601U dongle. ;)
YMMV. I suspect many users will be quite happy that "free" wifi (without the need to purchase a dongle) is available for general and light use, reading email or web browsing. For high-res audio playback, you'd probably have more success trying to suck doughnuts up a drinking straw!
@clivem ok, thanks for your explanation – good to see that I'm not the only one who ran into this issue. I too bought an MT7601U dongle as a work-around.
I see a new firmware is loaded, but for the problem in this thread there is no improvement.
20160607_00:47H.txt
What have you done to update? The new BCM43430 firmware will have no effect on these messages, but a new kernel and associated modules will.
I just updated my system with apt-get update && apt-get upgrade. It upgraded firmware-brcm80211 to 0.43+rpi4.
At boot it loaded Firmware version = wl0: May 27 2016 00:13:38 version 7.45.41.26 (r640327) FWID 01-df77e4a7.
Should I run rpi-update?
No - we haven't built an RPi firmware with the new commits in yet. It should happen in the next few days.
broom... carpet... lift... sweep....