Unstable data transmission: sequence numbers of consecutive frames get messed up
Hello,
I'm using a UDS client that uses the can-isotp Linux kernel module as the transport layer to carry out a firmware update on an ECU.
After requesting a download and receiving a confirmation from the ECU, I transfer blocks of 3074 bytes over CAN-ISOTP to deliver the firmware to the ECU. For this, I call write(int fd, const void *buf, size_t count) with the file descriptor obtained from socket(AF_CAN, SOCK_DGRAM | SOCK_NONBLOCK, CAN_ISOTP) and bound to a specified address. The transfer runs from a development board acting as a Linux gateway (which I access over SSH) to the ECU. I sniff the whole bus traffic with a PCAN-USB adapter (https://www.peak-system.com/PCAN-USB.199.0.html?&L=1) connected to an Ubuntu VM, using the candump command from can-utils.
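For readers who want to reproduce the setup, here is a minimal sketch of the socket handling described above. The function names open_isotp_socket and send_block are made up for illustration, and the CAN IDs are passed in as parameters because the real ones are hidden in this post. It assumes 29-bit identifiers (hence CAN_EFF_FLAG) and is not the actual code of my UDS client.

```c
#include <errno.h>
#include <net/if.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/can.h>

/* Open a non-blocking ISO-TP socket bound to `ifname`.
 * tx_id / rx_id are caller-supplied 29-bit CAN IDs (hypothetical here).
 * Returns the file descriptor, or -1 on error with errno set. */
int open_isotp_socket(const char *ifname, uint32_t tx_id, uint32_t rx_id)
{
    unsigned int ifindex = if_nametoindex(ifname);
    if (ifindex == 0)
        return -1;                      /* no such interface */

    int fd = socket(AF_CAN, SOCK_DGRAM | SOCK_NONBLOCK, CAN_ISOTP);
    if (fd < 0)
        return -1;                      /* e.g. can-isotp not available */

    struct sockaddr_can addr;
    memset(&addr, 0, sizeof(addr));
    addr.can_family = AF_CAN;
    addr.can_ifindex = (int)ifindex;
    addr.can_addr.tp.tx_id = tx_id | CAN_EFF_FLAG;  /* extended frame format */
    addr.can_addr.tp.rx_id = rx_id | CAN_EFF_FLAG;

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno;              /* keep bind()'s errno across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hand one complete block (e.g. 3074 bytes) to the kernel module.
 * Returns 0 if write() accepted the whole block, -1 otherwise. */
int send_block(int fd, const void *buf, size_t count)
{
    ssize_t n = write(fd, buf, count);
    return (n == (ssize_t)count) ? 0 : -1;
}
```

Segmentation into first frame and consecutive frames happens inside the kernel module after write() returns, so the application only sees whether the block was accepted.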
That's when I observe that after the eleventh consecutive frame (CF) is sent, the next sequence number is wrong (0x0 instead of 0xC), and further down it gets even worse: some sequence numbers are sent two or three times. It also does not transmit all of the required 3074 bytes. Here is an example candump trace (IDs hidden):
(255.316173) can0 18DAXXF1 [8] 1C 02 36 01 5B 47 65 6E // start of transfer, (0xC02) is for 3074 bytes
(255.318129) can0 18DAF1XX [8] 30 00 00 FF FF FF FF FF // flow control from ECU
(255.319999) can0 18DAXXF1 [8] 21 65 72 61 6C 49 6E 66 // sending consecutive frames
(255.319999) can0 18DAXXF1 [8] 22 6F 5D 0D 0A 54 59 50
(255.322550) can0 18DAXXF1 [8] 23 45 20 3D 20 30 78 30
(255.324832) can0 18DAXXF1 [8] 24 30 30 38 0D 0A 42 4F
(255.327750) can0 18DAXXF1 [8] 25 41 52 44 20 3D 20 22
(255.327750) can0 18DAXXF1 [8] 26 78 43 55 2D 54 48 48
(255.327751) can0 18DAXXF1 [8] 27 22 0D 0A 5B 4D 53 57
(255.330786) can0 18DAXXF1 [8] 28 43 6F 6E 74 65 6E 74
(255.330787) can0 18DAXXF1 [8] 29 5D 0D 0A 3A 32 30 30
(255.333698) can0 18DAXXF1 [8] 2A 30 30 30 30 30 37 34
(255.333698) can0 18DAXXF1 [8] 2B 37 34 35 46 36 36 37 // all sequence numbers were alright until here
(255.335666) can0 18DAXXF1 [8] 20 30 32 30 30 30 30 36
(255.335667) can0 18DAXXF1 [8] 26 31 46 41 35 44 33 34
(255.335667) can0 18DAXXF1 [8] 23 33 41 44 45 44 43 46
(255.337927) can0 18DAXXF1 [8] 21 39 34 43 39 43 35 34 // sequence number 1 three times
(255.337928) can0 18DAXXF1 [8] 21 30 41 30 30 30 32 32
(255.337928) can0 18DAXXF1 [8] 21 32 33 38 38 36 31 44
(255.340456) can0 18DAXXF1 [8] 20 32 46 39 43 32 39 43 // sequence number 0 two times
(255.340456) can0 18DAXXF1 [8] 20 30 44 33 39 30 41 34
(255.342514) can0 18DAXXF1 [8] 2F 44 33 37 44 39 41 39
(255.342514) can0 18DAXXF1 [8] 20 45 42 32 31 45 33 42
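The checks I did by eye on this trace can be sketched in a few lines of C. The helper names (ff_length, cf_count, first_bad_cf) are made up for illustration. The sketch assumes classic CAN with the frame layout visible in the trace: a first frame with two PCI bytes and 6 payload bytes, and consecutive frames with one PCI byte and up to 7 payload bytes. It only handles the 12-bit first-frame length.

```c
#include <stddef.h>
#include <stdint.h>

/* Total payload length encoded in the two PCI bytes of a First Frame
 * (12-bit length: low nibble of byte 0 plus all of byte 1). */
unsigned ff_length(uint8_t pci0, uint8_t pci1)
{
    return ((unsigned)(pci0 & 0x0F) << 8) | pci1;
}

/* Number of Consecutive Frames needed after a First Frame that already
 * carried 6 payload bytes, with up to 7 payload bytes per CF. */
unsigned cf_count(unsigned total_len)
{
    if (total_len <= 6)
        return 0;
    return (total_len - 6 + 7 - 1) / 7;   /* ceiling division */
}

/* Given the first byte of each CF in transmission order, return the
 * 0-based index of the first frame that breaks the expected sequence
 * 1, 2, ..., 0xF, 0, 1, ...; return -1 if every frame is in order. */
int first_bad_cf(const uint8_t *pci, size_t n)
{
    uint8_t expected = 1;
    for (size_t i = 0; i < n; i++) {
        if ((pci[i] & 0xF0) != 0x20 || (pci[i] & 0x0F) != expected)
            return (int)i;
        expected = (uint8_t)((expected + 1) & 0x0F);
    }
    return -1;
}
```

Applied to the trace, ff_length(0x1C, 0x02) gives 3074, cf_count(3074) gives 439 consecutive frames, and first_bad_cf on the bytes 21 ... 2B 20 returns index 11, i.e. the twelfth CF, which carries 0x0 where 0xC was expected.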
I have a few ideas about why this might not be working, e.g. interference on the CAN bus (a physical cause), because I would rather not assume that something is wrong with the kernel module. After double-checking my code, I realized that the problem arises after the write() call mentioned above, at which point the kernel module takes over the ISOTP part of the transfer.
I would highly appreciate any input as to why this might be happening - hardware or software related. Thanks in advance!
Realized my txqueuelen was set to 10. That's why beginning from the 11th frame things got messed up. Solved by setting a bigger value with:
ip link set dev can0 txqueuelen 4096
Hi @muehlke ,
in Linux 5.18+ the CAN frame handling has been reworked, which makes the txqueuelen tweak obsolete:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/net/can?h=linux-5.18.y&id=4b7fe92c06901f4563af0e36d25223a5ab343782
If you are able to upgrade your Linux kernel (e.g. with Ubuntu ppa packages), I would suggest updating to the latest longterm kernel, Linux 6.1.x, which has some more stability improvements in can-isotp.