Issue with bus.send_periodic
YuBer0 opened this issue · 54 comments
Hi, I'm trying to send CAN messages with the function send_periodic. However, I get the error:
can.exceptions.CanOperationError: Couldn't send CAN BCM frame due to OS Error: Invalid argument You are probably referring to a non-existing frame. [Error Code 22]
The code I used is:
import can
import time
try:
    bus = can.interface.Bus(channel = 'can0',
                            bustype = 'socketcan',
                            bitrate = 500000)
except OSError as e:
    print(e)
message = can.Message(arbitration_id=0x37A, data=[0x0A, 0x00, 0x3B, 0x00, 0xFF, 0x0B, 0x00, 0x00], is_extended_id= False)
message1 = can.Message(arbitration_id=0x379, data=[0x0C, 0x00, 0x0A, 0x00, 0xFF, 0x00, 0x0A, 0x00], is_extended_id= False)
message2 = can.Message(arbitration_id=0x372, data=[0x00, 0xD0, 0x50, 0x80, 0xCC, 0x00, 0xAA, 0xB0], is_extended_id= False)
period = 0.1
while True:
    bus.send_periodic(msgs = message, period = 0.1)
    bus.send_periodic(msgs = message1, period = 0.1)
    bus.send_periodic(msgs = message2, period = 0.1)
When I send the messages via bus.send instead, it seems to work:
import can
import time
bus = can.interface.Bus(channel='can0', bustype='socketcan')
message = can.Message(arbitration_id=0x37A, data=[0x0A, 0x00, 0x3B, 0x00, 0xFF, 0x0B, 0x00, 0x00], is_extended_id= False)
message1 = can.Message(arbitration_id=0x379, data=[0x0C, 0x00, 0x0A, 0x00, 0xFF, 0x00, 0x0A, 0x00], is_extended_id= False)
message2 = can.Message(arbitration_id=0x372, data=[0x00, 0xD0, 0x50, 0x80, 0xCC, 0x00, 0xAA, 0xB0], is_extended_id= False)
period = 0.1
while True:
    bus.send(message)
    time.sleep(period)
    bus.send(message1)
    time.sleep(period)
    bus.send(message2)
    time.sleep(period)
Here is the configuration my CAN device is running with:
RPI-4B 8GB Ram,
kernel version : 6.1.19-v8+
CAN transceiver device, MCP2515 (modified, changed VP230 for TJA1050)
ip -d -s link show can0
4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
link/can promiscuity 0 minmtu 0 maxmtu 0
can state ERROR-ACTIVE restart-ms 100
lsmod | grep spi
spidev 20480 2
spi_bcm2835 20480 0
ifconfig can0
can0: flags=193<UP,RUNNING,NOARP> mtu 16
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 10 (UNSPEC)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
/boot/config.txt
dtparam=spi=on
dtoverlay=mcp2515, spi0-0, interrupt=25,oscillator=8000000
dtoverlay=spi-dma
dtoverlay=spi-bcm2835
On boot, the RPi reported that it was unable to load spi-dma and spi-bcm2835:
failed to load dtoverlay=spi-dma
failed to load dtoverlay=spi-bcm2835
I'm pretty new to the RPi and CAN devices, as well as to posting issues on GitHub, so any advice would definitely help! Thanks
This repository is about datasheets of CAN IP cores, not about general support.
I'm not sure if the BCM is compiled into the raspi Linux kernel by default. Please check if it's loaded: after booting, check the kernel log with dmesg | grep -i can.
Hi, thank you for the prompt reply.
How should I move the issue to general support?
As for the kernel log, the output of dmesg | grep -i can
is:
CAN device driver interface
MCP251x spi0.0 can0: MCP2515 successfully initialized.
CAN: controller area network core
NET: Registered PF_CAN protocol family
IPv6: ADDRCONF(NETDEV_CHANGE): can0: link becomes ready
It doesn't show can: broadcast manager protocol, so BCM is not compiled into the kernel. You have to recompile your kernel with CAN_BCM enabled.
I'm sure you'll find a lot of documentation if you search for "how to compile kernel for raspberry pi".
Hi, I managed to recompile my kernel with CAN_BCM enabled, following the resources at this link.
I checked the kernel log with dmesg | grep -i can
and the output I get is:
CAN device driver interface
MCP251x spi0.0 can0: MCP2515 successfully initialized.
CAN: controller area network core
NET: Registered PF_CAN protocol family
CAN: raw protocol
IPv6: ADDRCONF(NETDEV_CHANGE): can0: link becomes ready
CAN: broadcast manager protocol
However, the same error persists when I try to use the function send_periodic. Also, after I reboot the system, the line CAN: broadcast manager protocol is no longer shown when I run dmesg | grep -i can.
If the BCM is a module it's only loaded if needed, so after you run your test program the first time.
Try running as root, if this doesn't work, maybe @hartkopp can help you.
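For readers who want to test this from Python without python-can: a minimal sketch that simply tries to open a CAN_BCM socket with the standard library. If the BCM is built as a module, opening the socket is normally what triggers loading it. The function name bcm_socket_status and the returned strings are made up for this illustration.

```python
import errno
import socket


def bcm_socket_status():
    """Try to open a CAN_BCM socket and report the result as a short string."""
    # AF_CAN and CAN_BCM only exist in Python's socket module on Linux builds.
    if not hasattr(socket, "AF_CAN") or not hasattr(socket, "CAN_BCM"):
        return "unsupported: this Python build has no AF_CAN/CAN_BCM"
    try:
        sock = socket.socket(socket.AF_CAN, socket.SOCK_DGRAM, socket.CAN_BCM)
    except OSError as exc:
        # Errors such as EPROTONOSUPPORT or EAFNOSUPPORT usually mean the
        # protocol is not available in the running kernel.
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return f"unavailable: {name}"
    sock.close()
    return "available"


if __name__ == "__main__":
    print(bcm_socket_status())
```

A result of "available" only shows that the protocol exists; it says nothing about whether a later TX_SETUP will be accepted.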
I see, this is helpful.
I can now see CAN: broadcast manager protocol after running dmesg | grep -i can; however, I still face the same error message:
can.exceptions.CanOperationError: Couldn't send CAN BCM frame due to OS Error: Invalid argument You are probably referring to a non-existing frame. [Error Code 22]
I tried the original code from #2 (comment) and only changed can0 to vcan0. That resulted in a working setup.
But candump any -td shows that the gap between the CAN frames is only a few microseconds:
(000.000030) vcan0 37A [8] 0A 00 3B 00 FF 0B 00 00
(000.000045) vcan0 379 [8] 0C 00 0A 00 FF 00 0A 00
(000.000036) vcan0 37A [8] 0A 00 3B 00 FF 0B 00 00
(000.000015) vcan0 372 [8] 00 D0 50 80 CC 00 AA B0
(000.000040) vcan0 37A [8] 0A 00 3B 00 FF 0B 00 00
(000.000044) vcan0 379 [8] 0C 00 0A 00 FF 00 0A 00
(000.000044) vcan0 372 [8] 00 D0 50 80 CC 00 AA B0
This would be definitely too fast for a real CAN bus.
Additionally this looks wrong:
while True:
    bus.send_periodic(msgs = message, period = 0.1)
    bus.send_periodic(msgs = message1, period = 0.1)
    bus.send_periodic(msgs = message2, period = 0.1)
You are creating a busy loop which continuously overwrites the current CAN_BCM setting to establish a periodic send job!
You likely wanted to have
bus.send_periodic(msgs = message, period = 0.1)
bus.send_periodic(msgs = message1, period = 0.1)
bus.send_periodic(msgs = message2, period = 0.1)
while True:
    time.sleep(1)
which leads to this output of candump any -td:
(000.100444) vcan0 37A [8] 0A 00 3B 00 FF 0B 00 00
(000.000030) vcan0 379 [8] 0C 00 0A 00 FF 00 0A 00
(000.000019) vcan0 372 [8] 00 D0 50 80 CC 00 AA B0
(000.100024) vcan0 37A [8] 0A 00 3B 00 FF 0B 00 00
(000.000095) vcan0 379 [8] 0C 00 0A 00 FF 00 0A 00
(000.000015) vcan0 372 [8] 00 D0 50 80 CC 00 AA B0
(000.100278) vcan0 37A [8] 0A 00 3B 00 FF 0B 00 00
(000.000019) vcan0 379 [8] 0C 00 0A 00 FF 00 0A 00
(000.000013) vcan0 372 [8] 00 D0 50 80 CC 00 AA B0
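To check the period per CAN ID rather than reading the deltas by eye, the candump -td output can be post-processed. The following is a rough sketch, assuming the line format shown above; the names LINE_RE, periods_per_id and SAMPLE are invented for this example, and SAMPLE is just the first six lines of the log above.

```python
import re
from collections import defaultdict

# Matches lines such as "(000.100444) vcan0 37A [8] 0A 00 ..."
LINE_RE = re.compile(r"\((\d+\.\d+)\)\s+(\S+)\s+([0-9A-Fa-f]+)\s+\[")


def periods_per_id(candump_text):
    """Return {can_id: [seconds between consecutive frames with that ID]}."""
    now = 0.0        # running time rebuilt by summing the -td deltas
    last_seen = {}   # can_id -> time of the previous frame with that ID
    periods = defaultdict(list)
    for line in candump_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        now += float(match.group(1))
        can_id = match.group(3).upper()
        if can_id in last_seen:
            periods[can_id].append(round(now - last_seen[can_id], 6))
        last_seen[can_id] = now
    return dict(periods)


SAMPLE = """\
(000.100444) vcan0 37A [8] 0A 00 3B 00 FF 0B 00 00
(000.000030) vcan0 379 [8] 0C 00 0A 00 FF 00 0A 00
(000.000019) vcan0 372 [8] 00 D0 50 80 CC 00 AA B0
(000.100024) vcan0 37A [8] 0A 00 3B 00 FF 0B 00 00
(000.000095) vcan0 379 [8] 0C 00 0A 00 FF 00 0A 00
(000.000015) vcan0 372 [8] 00 D0 50 80 CC 00 AA B0
"""

if __name__ == "__main__":
    for can_id, values in periods_per_id(SAMPLE).items():
        print(can_id, values)
```

With the corrected program each ID should show a period close to 0.1 s, whereas the busy-loop version produces gaps in the microsecond range.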
Hello Hartkopp, thank you for the reply!
Yes, you are right regarding the while True loop. I changed the loop portion of the code to what you suggested; however, I still get the same error code.
I'm not sure whether I have to download or install any other libraries in order to use BCM on the RPi with a physical CAN device?
Can you please check if it works with a virtual CAN interface in your setup (as I showed above), and check if the CAN traffic is analogous to my candump example?
We have to figure out if it is a CAN driver or BCM problem.
Sure, I just tried with your changes, and I get the same error message.
I'm curious: since I am able to use bus.send, does that mean there is a chance that the CAN driver has an issue?
Can you please post the output of lsmod | grep can and ip -d -s link show vcan0?
lsmod looks good.
But there are no RX/TX packets on the vcan0 interface.
When I run your program on vcan0 instead of can0, it looks like this:
$ ip -d -s link show vcan0
3: vcan0: <NOARP,UP,LOWER_UP> mtu 72 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/can promiscuity 0 allmulti 0 minmtu 0 maxmtu 0
vcan numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536
RX: bytes packets errors dropped missed mcast
29936 3742 0 0 0 0
TX: bytes packets errors dropped carrier collsns
29936 3742 0 0 0 0
I'm unable to run the code: when I try to run it, error code 22 comes up.
Can I check with you whether there are any other settings or configuration steps that I might have missed?
Please show the code you're trying to run. This works for me:
import can
import time
try:
    bus = can.interface.Bus(channel = 'vcan0',
                            bustype = 'socketcan',
                            bitrate = 500000)
except OSError as e:
    print(e)
message = can.Message(arbitration_id=0x37A, data=[0x0A, 0x00, 0x3B, 0x00, 0xFF, 0x0B, 0x00, 0x00], is_extended_id= False)
message1 = can.Message(arbitration_id=0x379, data=[0x0C, 0x00, 0x0A, 0x00, 0xFF, 0x00, 0x0A, 0x00], is_extended_id= False)
message2 = can.Message(arbitration_id=0x372, data=[0x00, 0xD0, 0x50, 0x80, 0xCC, 0x00, 0xAA, 0xB0], is_extended_id= False)
period = 0.1
bus.send_periodic(msgs = message, period = 0.1)
bus.send_periodic(msgs = message1, period = 0.1)
bus.send_periodic(msgs = message2, period = 0.1)
while True:
    time.sleep(1)
But if this had been started, why are there no RX/TX packets visible for vcan0?
I don't think the code managed to start because of the error, so there are no RX/TX packets for vcan0.
Can you create a log with strace:
strace -o log python3 test.py
Mine looks like this:
ioctl(3, SIOCGIFINDEX, {ifr_name="vcan0", ifr_ifindex=6}) = 0
bind(3, {sa_family=AF_CAN, sa_data="\225U\6\0\0\0\213\352P\0\0\0\0\0\240\362.\1\0\0\0\0"}, 24) = 0
setsockopt(3, SOL_CAN_RAW, CAN_RAW_FILTER, "\0\0\0\0\0\0\0\0", 8) = 0
socket(AF_CAN, SOCK_DGRAM|SOCK_CLOEXEC, CAN_BCM) = 4
ioctl(4, SIOCGIFINDEX, {ifr_name="vcan0", ifr_ifindex=6}) = 0
connect(4, {sa_family=AF_CAN, sa_data="\225U\6\0\0\0\213\352P\0\0\0\0\0 Jq\1\0\0\0\0"}, 24) = 0
sendto(4, "\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 56, 0, NULL, 0) = -1 EINVAL (Invalid argument)
sendto(4, "\1\0\0\0\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 72, 0, NULL, 0) = 72
sendto(4, "\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 56, 0, NULL, 0) = -1 EINVAL (Invalid argument)
sendto(4, "\1\0\0\0\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 72, 0, NULL, 0) = 72
sendto(4, "\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 56, 0, NULL, 0) = -1 EINVAL (Invalid argument)
sendto(4, "\1\0\0\0\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 72, 0, NULL, 0) = 72
clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, {tv_sec=413541, tv_nsec=40898648}, NULL) = 0
clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, {tv_sec=413542, tv_nsec=40984424}, NULL) = 0
Interestingly there are EINVAL errors, too, but my python seems to retry with a larger length.
Which python-can version are you using? Try apt-cache policy python3-can.
BTW: please copy/paste from your terminal, no need for screen shots.
Sure, for strace -o log python3 test.py:
strace -o log python3 /home/Test_code/zzzz.py
Traceback (most recent call last):
File "/home/.local/lib/python3.9/site-packages/can/interfaces/socketcan/socketcan.py", line 280, in send_bcm
return bcm_socket.send(data)
OSError: [Errno 22] Invalid argument
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/Test_code/zzzz.py", line 15, in <module>
bus.send_periodic(msgs = message, period = 0.1)
File "/home/.local/lib/python3.9/site-packages/can/bus.py", line 242, in send_periodic
self._send_periodic_internal(msgs, period, duration),
File "/home/.local/lib/python3.9/site-packages/can/interfaces/socketcan/socketcan.py", line 838, in _send_periodic_internal
task = CyclicSendTask(bcm_socket, task_id, msgs, period, duration)
File "/home/.local/lib/python3.9/site-packages/can/interfaces/socketcan/socketcan.py", line 350, in __init__
self._tx_setup(self.messages)
File "/home/.local/lib/python3.9/site-packages/can/interfaces/socketcan/socketcan.py", line 377, in _tx_setup
send_bcm(self.bcm_socket, header + body)
File "/home/.local/lib/python3.9/site-packages/can/interfaces/socketcan/socketcan.py", line 293, in send_bcm
raise can.CanOperationError(base + specific_message, error.errno) from error
can.exceptions.CanOperationError: Couldn't send CAN BCM frame due to OS Error: Invalid argument You are probably referring to a non-existing frame. [Error Code 22]
As for python-can, when I use apt-cache policy python3-can:
python3-can:
Installed: (none)
Candidate: 3.3.2.final~github-2
Version table:
3.3.2.final~github-2 500
500 http://raspbian.raspberrypi.org/raspbian bullseye/main armhf Packages
But when I use pip show python-can:
Name: python-can
Version: 4.1.0
Summary: Controller Area Network interface module for Python
Home-page: https://github.com/hardbyte/python-can
Author: python-can contributors
Author-email: None
License: LGPL v3
python-can obviously does a TX_READ (0x03) first and then does a TX_SETUP (0x01).
Don't know why. But when you read a non-existing element (in bcm_read_op() in bcm.c), you get -EINVAL ... which is correct.
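To make the opcode reading of the strace payloads reproducible, here is a small sketch that decodes the first two 32-bit fields (opcode and flags) of a BCM message. The opcode numbers and the two flag bits are taken from the enum and defines in linux/can/bcm.h; little-endian byte order is assumed because both traces come from little-endian machines, and the helper name describe_bcm_prefix is invented for this example.

```python
import struct

# Opcode values from the enum in <linux/can/bcm.h> (subset).
BCM_OPCODES = {
    1: "TX_SETUP", 2: "TX_DELETE", 3: "TX_READ", 4: "TX_SEND",
    5: "RX_SETUP", 6: "RX_DELETE", 7: "RX_READ",
}
# Flag bits from the same header (only the two that appear in the traces).
SETTIMER = 0x0001
STARTTIMER = 0x0002


def describe_bcm_prefix(payload):
    """Decode opcode and flags from the first 8 bytes of a BCM message."""
    opcode, flags = struct.unpack_from("<II", payload, 0)
    flag_names = [
        name
        for bit, name in ((SETTIMER, "SETTIMER"), (STARTTIMER, "STARTTIMER"))
        if flags & bit
    ]
    return BCM_OPCODES.get(opcode, f"unknown({opcode})"), flag_names


if __name__ == "__main__":
    # First 8 bytes of the two message kinds seen in the strace logs.
    print(describe_bcm_prefix(b"\x03\x00\x00\x00\x00\x00\x00\x00"))
    print(describe_bcm_prefix(b"\x01\x00\x00\x00\x03\x00\x00\x00"))
```

Opcode and flags sit at offsets 0 and 4 in both the 32-bit and the 64-bit layout, so this part of the header decodes the same way regardless of the user space word size.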
Sure, for strace -o log python3 test.py, strace -o log python3 /home/continental/Test_code/zzzz.py
The strace log has to look like the example from @marckleinebudde
How can I go about it? Since we both use the same code, could the difference in Python library versions or the OS we use affect it?
Please send a strace output as posted by @marckleinebudde here #2 (comment)
I'm not sure if this is the output that you are looking for, as the strace output log is quite long.
I have attached the log:
log.txt
The output printed in the terminal can be seen in my earlier reply.
The interesting part is here:
socket(AF_CAN, SOCK_RAW|SOCK_CLOEXEC, CAN_RAW) = 3
setsockopt(3, SOL_CAN_RAW, 3, [1], 4) = 0
setsockopt(3, SOL_CAN_RAW, 4, [0], 4) = 0
setsockopt(3, SOL_CAN_RAW, 2, [536870911], 4) = 0
setsockopt(3, SOL_SOCKET, SO_TIMESTAMPNS_OLD, [1], 4) = 0
ioctl(3, SIOCGIFINDEX, {ifr_name="vcan0", }) = 0
bind(3, {sa_family=AF_CAN, sa_data="\24\367\5\0\0\0\0\0\0\0\224\354G\0\224\271\"\367\2\0\0\0"}, 24) = 0
setsockopt(3, SOL_CAN_RAW, 1, "\0\0\0\0\0\0\0\0", 8) = 0
socket(AF_CAN, SOCK_DGRAM|SOCK_CLOEXEC, CAN_BCM) = 4
ioctl(4, SIOCGIFINDEX, {ifr_name="vcan0", }) = 0
connect(4, {sa_family=AF_CAN, sa_data="-\367\5\0\0\0\314\276I\0\0\0\0\0\34\317\7\0\350^\v\0"}, 24) = 0
send(4, "\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0"..., 40, 0) = -1 EINVAL (Invalid argument)
send(4, "\1\0\0\0\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\240\206\1\0\1\0\0\0"..., 56, 0) = -1 EINVAL (Invalid argument)
This is the relevant part of the strace log:
socket(AF_CAN, SOCK_DGRAM|SOCK_CLOEXEC, CAN_BCM) = 4
ioctl(4, SIOCGIFINDEX, {ifr_name="vcan0", }) = 0
connect(4, {sa_family=AF_CAN, sa_data="-\367\5\0\0\0\314\276I\0\0\0\0\0\34\317\7\0\350^\v\0"}, 24) = 0
send(4, "\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0"..., 40, 0) = -1 EINVAL (Invalid argument)
send(4, "\1\0\0\0\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\240\206\1\0\1\0\0\0"..., 56, 0) = -1 EINVAL (Invalid argument)
As struct bcm_msg_head is 56 bytes (as you can see in Marc's log), I wonder why your setup sends 40/56 bytes instead of 56/72 ...
It's a 32 bit user space....
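As a sanity check of the 32 bit user space reading, the 32 bytes that strace shows for the failing 56-byte send() can be decoded under the assumption that both fields of struct bcm_timeval are 4 bytes wide and the data is little-endian. This is only a sketch; PREFIX is transcribed from the log above, and FIELDS and decode_32bit_prefix are names invented here.

```python
import struct

# First 32 bytes of the failing 56-byte send() from the strace log above.
PREFIX = bytes.fromhex(
    "01000000" "03000000" "00000000"  # opcode, flags, count
    "00000000" "00000000"             # ival1: tv_sec, tv_usec
    "00000000" "a0860100"             # ival2: tv_sec, tv_usec
    "01000000"                        # can_id
)

FIELDS = (
    "opcode", "flags", "count",
    "ival1_sec", "ival1_usec",
    "ival2_sec", "ival2_usec",
    "can_id",
)


def decode_32bit_prefix(data):
    """Decode the header start assuming 4-byte tv_sec/tv_usec fields."""
    values = struct.unpack("<IIIiiiiI", data[:32])
    return dict(zip(FIELDS, values))


if __name__ == "__main__":
    for name, value in decode_32bit_prefix(PREFIX).items():
        print(f"{name:>10} = {value}")
```

Read this way, ival2 comes out as 0 s plus 100000 µs, which matches period = 0.1 in the test program. A kernel that expects 8-byte timeval fields would read the same bytes at different offsets and see something else.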
@YuBer0 What's the output of uname -a
Linux pi 6.1.19-v8+ #1637 SMP PREEMPT Tue Mar 14 11:11:47 GMT 2023 aarch64 GNU/Linux
my pahole says:
struct bcm_msg_head {
__u32 opcode; /* 0 4 */
__u32 flags; /* 4 4 */
__u32 count; /* 8 4 */
/* XXX 4 bytes hole, try to pack */
struct bcm_timeval ival1; /* 16 16 */
struct bcm_timeval ival2; /* 32 16 */
canid_t can_id; /* 48 4 */
__u32 nframes; /* 52 4 */
struct can_frame frames[]; /* 56 0 */
/* size: 56, cachelines: 1, members: 8 */
/* sum members: 52, holes: 1, sum holes: 4 */
/* last cacheline: 56 bytes */
};
uname -a
Linux box 6.4.0-rc2 #2 SMP PREEMPT_DYNAMIC Wed May 17 17:12:02 CEST 2023 x86_64 GNU/Linux
@hartkopp 64 bit kernel with a 32 bit user space. Another issue due to MSG_CMSG_COMPAT?
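A quick way to spot this combination from inside Python is to compare the interpreter's pointer size with the machine name the kernel reports. This is a heuristic sketch only: the set of 64-bit machine names is an incomplete, hand-picked assumption, and the function names are invented for this example.

```python
import os
import struct


def userspace_bits():
    """Pointer width of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def looks_mixed(pointer_bits, kernel_machine):
    """True when a 32-bit process runs on a kernel reporting a 64-bit machine."""
    # Incomplete list of 64-bit machine names as reported by uname -m.
    sixty_four_bit = {"aarch64", "x86_64", "ppc64", "ppc64le", "riscv64", "s390x"}
    return pointer_bits == 32 and kernel_machine in sixty_four_bit


if __name__ == "__main__":
    machine = os.uname().machine  # same value as `uname -m`
    bits = userspace_bits()
    print(f"user space: {bits} bit, kernel machine: {machine}, "
          f"mixed: {looks_mixed(bits, machine)}")
```

On the setup in this issue one would expect a 32 bit interpreter together with aarch64, which is the case the check flags.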
Hm, might be.
Arnd added this patch, which I hoped would fix things up ...
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ba61a8d9d7809
A 32 bit ARM kernel has the following layout:
See difference in struct bcm_timeval
struct bcm_msg_head {
__u32 opcode; /* 0 4 */
__u32 flags; /* 4 4 */
__u32 count; /* 8 4 */
struct bcm_timeval ival1; /* 12 8 */
struct bcm_timeval ival2; /* 20 8 */
canid_t can_id; /* 28 4 */
__u32 nframes; /* 32 4 */
/* XXX 4 bytes hole, try to pack */
struct can_frame frames[] __attribute__((__aligned__(8))); /* 40 0 */
/* size: 40, cachelines: 1, members: 8 */
/* sum members: 36, holes: 1, sum holes: 4 */
/* forced alignments: 1, forced holes: 1, sum forced holes: 4 */
/* last cacheline: 40 bytes */
} __attribute__((__aligned__(8)));
struct bcm_timeval {
long tv_sec;
long tv_usec;
};
and long is 32 bit on 32 bit ARM.
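The two pahole layouts can be reproduced with Python's struct module by spelling out the field widths and padding explicitly. This sketch assumes a classic 16-byte struct can_frame and uses "<" so that the format strings do not depend on the machine running the snippet; HEAD_64 and HEAD_32 are names chosen for this example.

```python
import struct

# Classic struct can_frame: 4-byte ID, 4 bytes length/padding, 8 data bytes.
CAN_FRAME_SIZE = 16

# 64-bit layout: three u32, 4 bytes of padding, four 8-byte longs
# (ival1 and ival2), then can_id and nframes.
HEAD_64 = struct.calcsize("<IIIxxxxqqqqII")

# 32-bit ARM layout: three u32, four 4-byte longs, can_id, nframes,
# then 4 bytes of padding so that frames[] starts 8-byte aligned.
HEAD_32 = struct.calcsize("<IIIiiiiIIxxxx")

if __name__ == "__main__":
    for name, head in (("64-bit", HEAD_64), ("32-bit", HEAD_32)):
        print(f"{name}: header {head} bytes, "
              f"header + 1 frame {head + CAN_FRAME_SIZE} bytes")
```

The resulting pairs, 56/72 and 40/56, are the message lengths seen in the two strace logs: TX_READ sends only the header, TX_SETUP sends the header plus one frame.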
Hm, might be.
Arnd added this patch, which I hoped would fix things up ...
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ba61a8d9d7809
Seems it's time to pick up that old thread.
It seems we have to evaluate if MSG_CMSG_COMPAT is set and then treat the received message differently.
@YuBer0 the instant fix for you would be to compile your Kernel as 32 bit Kernel as you have a 32 bit user space on your system.
@hartkopp something like in: https://elixir.bootlin.com/linux/v6.3/source/net/bluetooth/hci_sock.c#L1422
@marckleinebudde I'm not sure if this helps as the COMPAT stuff is intended for CMSG handling.
But I would be fine with adding some analogous code if it doesn't make the situation worse ;-D
@YuBer0 the instant fix for you would be to compile your Kernel as 32 bit Kernel as you have a 32 bit user space on your system.
awesome!
@hartkopp @marckleinebudde
thank you so much for working on this. really appreciate it!
I have some prototype code, will test tomorrow. gn8
Hey! I just tested it out, and it's working perfectly. Just wondering, do I have to mark this as closed?
I ask because I noted that you would want to work on the library.
Please leave this open, I want to try to fix 32 bit userspace on 64 bit kernels.
It doesn't work, as send() gets no compat flag...
For the send() path we can check the length and the consistency and can probably decide for one or the other. For the recv() path we're out of luck. An ioctl() interface would have been better, as there is a dedicated compat_ioctl() callback in the struct proto_ops.
Thanks for the investigation!
The question is whether it helps to introduce a compat_ioctl() when nobody knows or cares about it.
This issue was the first feedback on this potential problem after years - and to me it was some kind of an accident when compiling a 64 bit kernel on a 32 bit system and userland.
The good thing about your investigation:
There is always a send or sendmsg or sendto before something can be received on that socket.
So we could check for the bcm_msg_head size and then switch the entire socket (session) to 32/64 bit.
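To illustrate the idea of checking the bcm_msg_head size and its consistency, here is a user space sketch in Python. It is not the kernel code and not the WIP patch; it assumes classic 16-byte CAN frames, little-endian data, the two header layouts from the pahole output above, and that TX_SETUP needs at least one frame. All names are invented for this example.

```python
import struct

CAN_FRAME_SIZE = 16   # classic struct can_frame; CAN FD is ignored here
TX_SETUP = 1          # opcode value from <linux/can/bcm.h>

# name -> (header size, offset of the nframes field)
LAYOUTS = {"64-bit": (56, 52), "32-bit": (40, 32)}


def plausible_layouts(message):
    """Return the layouts under which the message length is self-consistent."""
    if len(message) < 4:
        return []
    (opcode,) = struct.unpack_from("<I", message, 0)
    matches = []
    for name, (head_size, nframes_offset) in LAYOUTS.items():
        if len(message) < head_size:
            continue
        (nframes,) = struct.unpack_from("<I", message, nframes_offset)
        # A periodic TX job without any frame makes no sense.
        if opcode == TX_SETUP and nframes < 1:
            continue
        if len(message) == head_size + nframes * CAN_FRAME_SIZE:
            matches.append(name)
    return matches


if __name__ == "__main__":
    frame = bytes(CAN_FRAME_SIZE)
    msg32 = struct.pack("<IIIiiiiIIxxxx", TX_SETUP, 3, 0, 0, 0, 0, 100000, 1, 1) + frame
    msg64 = struct.pack("<IIIxxxxqqqqII", TX_SETUP, 3, 0, 0, 0, 0, 100000, 1, 1) + frame
    print(len(msg32), plausible_layouts(msg32))
    print(len(msg64), plausible_layouts(msg64))
```

The length check alone is not enough: a 56-byte 32-bit TX_SETUP whose frame bytes are zero also has the length of a 64-bit header with nframes = 0, which is why the sketch adds the opcode-specific check. A real implementation would likely need more checks of this kind.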
ps. I still wonder if it would be worth the effort or if we better add some documentation to describe this potential problem ...
Filedescriptors can be passed from one process to another :) So it's not a 100% solution.
I still wonder if it would be worth the effort or if we better add some documentation to describe this potential problem ...
Adding documentation is always good, but this is not a potential problem, this is a very real problem on 32 bit userspace on 64 bit kernels.
Filedescriptors can be passed from one process to another :) So it's not a 100% solution.
But then you would need to pass it to another process that has a different 32/64 architecture - is this a valid problem?
I still wonder if it would be worth the effort or if we better add some documentation to describe this potential problem ...
Adding documentation is always good, but this is not a potential problem, this is a very real problem on 32 bit userspace on 64 bit kernels.
Is it? When I get a RasPi or Debian OS image, then I get a consistent kernel with a consistent user land. And when I install additional packages or compile new stuff they share the identical word size.
So it would just lead to problems, if someone copies binaries from a different user land installation, right?
I assume that you built a 64 bit Kernel when following your referenced process here: #2 (comment)
We have a $CUSTOMER using a 32 bit user land on a 64 bit kernel (though they are not using CAN, but 100G Ethernet). It's a real world use case!
Hm, wasn't aware of such use-case.
I will take a look on how to handle auto-detecting 32/64 bit sized bcm_msg_head structures.
Feel free to use my WIP code: https://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next.git/log/?h=bcm
The if (msg->msg_flags & MSG_CMSG_COMPAT) must be replaced by some function that copies the compat header and checks its integrity.