Koenkk/zigbee2mqtt

CH341 driver bugged in recent Linux kernel versions

Koenkk opened this issue · 38 comments

Recent Linux kernel versions contain a bug in the CH341 driver. The CH341 controller is used on various coordinators, e.g. the zzh. The bug causes strange behavior, e.g. delays when receiving messages from devices.

How do I know I'm affected?

Your coordinator uses a CH340/CH341 controller (e.g. zzh) AND you are running one of the following kernel versions (run uname -r to check which):

  • 5.13.*: 5.13.10 till 5.13.13 are affected, fixed in 5.13.14
  • 5.10.*: 5.10.58 till 5.10.61 are affected, fixed in 5.10.62
  • 5.4.*: 5.4.140 till 5.4.143 are affected, fixed in 5.4.144
  • Ubuntu Focal: 5.4.0-85 till 5.4.0-89 are affected (fixed in 5.4.0-90)
  • I did not check other kernel versions
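
For anyone who wants to script the check, here is a minimal sketch that compares a `uname -r` string against the ranges listed above. The function name `is_affected` and the parsing rules are my own assumptions, not something from zigbee2mqtt. A `False` result only means the release is not in the list above; other kernels were not checked, and distro kernels with their own numbering are not covered.

```python
import re

# Affected upstream ranges copied from the list above:
# (major, minor) -> (first affected patch level, last affected patch level)
AFFECTED_UPSTREAM = {
    (5, 13): (10, 13),
    (5, 10): (58, 61),
    (5, 4): (140, 143),
}

# Ubuntu Focal stays on 5.4.0 and bumps the ABI number after the dash.
AFFECTED_FOCAL_ABI = (85, 89)


def is_affected(release: str) -> bool:
    """Return True if `release` (as printed by `uname -r`) is in a listed range."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)(?:-(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = int(m[1]), int(m[2]), int(m[3])
    abi = int(m[4]) if m[4] else None

    # Ubuntu-style release such as 5.4.0-86-generic
    if (major, minor, patch) == (5, 4, 0) and abi is not None:
        return AFFECTED_FOCAL_ABI[0] <= abi <= AFFECTED_FOCAL_ABI[1]

    # Upstream-style release such as 5.10.60 or 5.10.60-v7+
    bounds = AFFECTED_UPSTREAM.get((major, minor))
    return bounds is not None and bounds[0] <= patch <= bounds[1]
```

To test the running system, pass it `platform.release()` from the standard library, which returns the same string as `uname -r` on Linux.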

Workaround

  • Downgrade your kernel version
  • For hassos users, this will be fixed in 6.4; for now downgrade to 6.2 using: ha os update --version 6.2

Thanks 💯 it saved my network ;) I downgraded to OS v6.2

HassOS 6.4 has been released? Is it still affected by this bug?

no, will be fixed.

Just for information, do you know roughly what the time window is?
I'd like to understand whether it is more like 1 month or 1 week, for example.

I know this is not up to you, but maybe you know something more :-)
wbr
Tz

bruvv commented

Hassos 6.4 has been released and this is now fixed.

@Koenkk are you sure that only CH341 is affected? I have slaesh's stick with a CP2102 and it behaved "sluggish" with the recent kernel (5.10.60) on a Raspberry Pi 3B. E.g. OTA updates always failed, and a Sonoff ZBMINI switched with a noticeable delay, especially when controlled within a group (broadcast).
I downgraded the Kernel to 5.10.52, and OTA updates work now. ZBMINI behaves better.
Just a coincidence, or a related problem?

@okastl CP2102 uses a different driver, so it's definitely not affected by the CH341 driver bug (but maybe the CP2102 driver also has a bug; will keep this in mind for slaesh users)
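
If you are unsure which driver your stick is bound to, a rough sketch like the one below reads it from sysfs. The helper name `usb_serial_driver` and the `sysfs_root` parameter are hypothetical (the parameter is only there so the function can be tried against a fake tree). It assumes the usual layout where `/sys/class/tty/<tty>/device/driver` is a symlink to the driver directory. I'd expect names along the lines of `ch341-uart` or `cp210x`, but treat the exact strings as an assumption.

```python
import os
from typing import Optional


def usb_serial_driver(tty: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the kernel driver bound to a tty such as 'ttyUSB0', or None.

    Assumes sysfs exposes <sysfs_root>/class/tty/<tty>/device/driver as a
    symlink pointing at the driver's directory.
    """
    link = os.path.join(sysfs_root, "class", "tty", tty, "device", "driver")
    if not os.path.exists(link):
        return None
    # The last path component of the symlink target is the driver name.
    return os.path.basename(os.path.realpath(link))
```

Calling `usb_serial_driver("ttyUSB0")` on the host that has the coordinator plugged in should tell you whether the CH341 list above applies to your adapter at all.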

Quick update on this, I've just updated the kernel in my NanoPi M4V2 to 5.10.63 and the problem is not happening anymore, thus confirming that this was indeed a kernel issue.

I just updated to 6.4 and Zigbee2MQTT wasn't working anymore at all...

Which adapter are you using?

I can confirm on my Raspberry Pi 3 that 5.10.63 is still buggy. The issue was resolved when I downgraded all the way to 5.10.52.

rpi-update from commit b4e395b3e87dba4964f314e12871630cabb35f70 fixed the issue (5.10.63-v7+), I'm using zzh! CC2652R with rpi3 B+

Ah, and I was wondering what was going on when I upgraded to OpenWrt 21.02 (I'm using zzh! over TCP using ser2net) - suddenly messages from Xiaomi switches (and perhaps some other sensors) wouldn't be delivered instantly, but only with another message from a different device. OpenWrt 21.02.0 has kernel 5.4.143. Let's hope 21.02.1 is published soon, it will have at least 5.4.145.

I found the same issue with the latest rpi kernel, so I tried the latest Linux driver from the vendor.

https://github.com/daverobertson63/CH341SER

My solution was to replace the current driver ch341.ko with the ch34x.ko driver. I only tested on rpi and udoo quad, running Ubuntu 14.

@daverobertson63 thanks, it seems this bug also got into the latest Ubuntu 20.04 kernel (updated the OP with version numbers). With your driver everything works perfectly!

On hassio 6.4 the problem is solved; tested with all Aqara sensors.
Stick: zzh!

@Koenkk Hi - I am having the same issue as described above with my Xiaomi motion sensors and Xiaomi wireless switch/button. I am running z2m on docker and have the latest version installed. I got the CC2652R and everything has been working fine until the recent update.

I am running my system on Ubuntu and my kernel version is : 4.15.0-159-generic. I got a philips hue motion sensor to replace my xiaomi one, and I also experience delays with the hue motion sensor.

Can you please help?

working fine until the recent update.

Recent update of what? If you upgraded zigbee2mqtt, have you tried downgrading back to the previous version?

I am running my system on Ubuntu and my kernel version is : 4.15.0-159-generic.

What version of Ubuntu is that? That's an ancient kernel version.

Anyway, tagging the author is not very nice, you're commenting on an issue that is not caused by zigbee2mqtt. Please reconsider next time :)

Hi @daverobertson63, how do I update the driver in Ubuntu? Things are not working for me and I really need to resolve this issue. Can you please help? Thanks

I followed this to get the headers onto the Linux machine

https://www.tecmint.com/install-kernel-headers-in-ubuntu-and-debian/

That's the most important part. When you run make and then sudo make load in the driver code, it should just work. If make fails, you probably don't have the headers or gcc installed.

Then just follow the instructions. I noted that some drivers are gzipped and some are not; you can figure that out by looking at what's there.

Make sure you run depmod, as that creates the kernel's module dependency map for the devices. It's more or less the same as what I did for the Pi. I only tested on Ubuntu 14, so things may have changed, but make and sudo make load should work on pretty much anything.

@daverobertson63 your driver works great with Ubuntu Server 20.04.3 LTS (5.11.0-37-generic #41~20.04.2-Ubuntu). Would it be possible to make it a dkms module? Thanks!

Edit: On Ubuntu Server I had to use /lib/modules to place the driver as the kernel headers seem to be there, not in /usr/lib/modules

Hi,
I just wanted to share that I also have issues with my Xiaomi wireless motion sensors being very slow (while IKEA TRADFRI lamps work fine). I was also running 4.15.0-159-generic (Ubuntu Server 18.04.05 LTS) and your comment made me realise that the issue was indeed this bug. I performed a rollback to 4.15.0-158-generic using GRUB and the issue with the Xiaomi motion sensors was fixed instantly.

So either perform a rollback or upgrade to 20.04 which is what I'll be doing soon :-)

20.04 has the same issue right now.

Hi, which kernel should I install on an rpi4?
At the moment I have 5.10.73-v7l+

I tried to install https://github.com/daverobertson63/CH341SER

but I get this error

make -C /lib/modules/5.10.73-v7l+/build M=/home/fennec/CH341SER
make[1]: *** /lib/modules/5.10.73-v7l+/build: No such file or directory.  Stop.
make: *** [Makefile:7: default] Error 2

thanks for help

If you can, follow the tutorial I wrote over the weekend and see if that helps. It looks like you don't have the kernel headers installed. If you have followed my tutorial and still get the error... oh dear. I'm not sure I can help, as this error looks like what you get when the kernel headers are missing.

https://github.com/daverobertson63/CH341SER/blob/master/README.md
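
The make error quoted above complains that `/lib/modules/<release>/build` does not exist, which is the directory an out-of-tree module build compiles against. As a quick pre-flight before running make, a sketch like this checks for it. The helper name `find_kernel_build_dir` is made up for illustration, and looking in both `/lib/modules` and `/usr/lib/modules` is an assumption based on the paths mentioned earlier in this thread.

```python
import os
from typing import Optional, Sequence


def find_kernel_build_dir(
    release: str,
    roots: Sequence[str] = ("/lib/modules", "/usr/lib/modules"),
) -> Optional[str]:
    """Return the first existing <root>/<release>/build directory, else None.

    `release` is the string printed by `uname -r`. os.path.isdir follows
    symlinks, so a dangling `build` link (headers not installed) counts
    as missing.
    """
    for root in roots:
        candidate = os.path.join(root, release, "build")
        if os.path.isdir(candidate):
            return candidate
    return None
```

If `find_kernel_build_dir(platform.release())` returns `None`, the headers for the running kernel are most likely not installed yet and make will fail the same way.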

I had to install rpi-source

and after running

sudo rpi-source

I could compile

I also have another controller:

Slaesh's CC2652RB stick

For the past few days I've seen a delay with it, but only on the first command; once that one goes through, further commands are sent immediately.

Do you think it could be the same bug?

usb 1-1.4: cp210x converter now attached to ttyUSB1

Thanks

Thanks, upgrading kernel helped!

For anyone on Ubuntu, to install new kernel (before it is released into mainline repo), you can follow instructions under Install Howto on this page: https://ubuntu.pkgs.org/20.04/ubuntu-proposed-main-amd64/linux-image-5.4.0-90-generic_5.4.0-90.101_amd64.deb.html

However, you might also want to run sudo apt-get install linux-modules-extra-5.4.0-90-generic before rebooting, otherwise some devices (such as the network card) might not work.

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

Anyone aware if this is fixed in 5.11.0-41? (hwe on Ubuntu 20.04.3 LTS)

Can't confirm 5.11.0-41 to be much better. The delay seems to be gone, but at times, especially when you rely on multiple zigbee devices to react at the same time, it's still a mess.

Running the CH341X driver as DKMS module still works better.

I'm testing 5.10.83 for 24 hours already. Looks like problem has been fixed.
Also tried to use multiple zigbee devices at the same time, no delays, no problems.

I'm on 20.04 LTS and 5.4.0-104-generic. My zzh stick keeps timing out after a day. Is this the same issue? I can't fix it without a complete reboot and ejecting the stick.

Mar 23 15:46:42 OpenHAB3 npm[895]: (node:895) UnhandledPromiseRejectionWarning: Error: SRSP - ZDO - mgmtPermitJoinReq after 6000ms
Mar 23 15:46:42 OpenHAB3 npm[895]:     at Timeout._onTimeout (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/src/utils/waitress.ts:64:35)
Mar 23 15:46:42 OpenHAB3 npm[895]:     at listOnTimeout (internal/timers.js:554:17)
Mar 23 15:46:42 OpenHAB3 npm[895]:     at processTimers (internal/timers.js:497:7)
Mar 23 15:46:42 OpenHAB3 npm[895]: (node:895) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was>
Mar 23 15:50:02 OpenHAB3 npm[895]: (node:895) UnhandledPromiseRejectionWarning: Error: SRSP - ZDO - mgmtPermitJoinReq after 6000ms
Mar 23 15:50:02 OpenHAB3 npm[895]:     at Timeout._onTimeout (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/src/utils/waitress.ts:64:35)
Mar 23 15:50:02 OpenHAB3 npm[895]:     at listOnTimeout (internal/timers.js:554:17)
Mar 23 15:50:02 OpenHAB3 npm[895]:     at processTimers (internal/timers.js:497:7)
Mar 23 15:50:02 OpenHAB3 npm[895]: (node:895) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was

I have the exact same problem since updating to 1.25.0, running the Home Assistant add-on on an Ubuntu machine.

@Heronimonimo this looks like a different issue. Open a new issue for this :)

I've been having various stability issues recently, where before (6 months to a year ago) all worked well. No idea what the cause is, whether it's the zzh firmware, a bug in zigbee2mqtt, the driver, etc. I'm currently running kernel 5.15.0-1006 on Ubuntu 22.04 on a Raspberry Pi 4, having switched from Raspberry Pi OS, where I had the same issues; latency is in the region of 2-3 seconds. Tried all the usual: reinstall from scratch, change channel (I have a much better signal on channel 25 now), etc. Really hope the root problem can be identified and fixed soon, whatever it is.