nabaztag2018/pynab

RFID tag handling broken on Pi Zero 2

Closed this issue · 8 comments

Reading or writing a tag doesn't seem to work on Pi Zero 2:

  • nabd gets stuck (no visible exceptions in /var/log/nabd.log) and cannot be stopped .
  • Eventually the whole system gets stuck and has to be power-cycled.

This looks like a a low-level problem with the cr14 driver.

If there is a problem with cr14 driver, there might be some log in dmesg or kernel.log. Did you get anything?

Also you can try and stop nab*d and interact directly with the cr14 driver by opening /dev/rfid0 or by using example scripts.

Also you can try and stop nab*d and interact directly with the cr14 driver by opening /dev/rfid0 or by using example scripts.

This was to be my next step.

If there is a problem with cr14 driver, there might be some log in dmesg or kernel.log. Did you get anything?

Not easy to get , since the whole system hangs.
I have the following in dmesg, but it appears (without apparently causing any harm?) even if cr14 driver is not used:

[11012.810983] WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1372 del_timer_sync+0x50/0x64
...
[11012.811459] [<7f0d5f10>] (tagtagtagear_irq_handler [tagtagtag_ears]) from [<80185fc0>] (__handle_irq_event_percpu+0xa8/0x268)
...
[11012.812054] ---[ end trace 585c9e0bce35d3c2 ]---

I have the following in dmesg, but it appears (without apparently causing any harm?) even if cr14 driver is not used:

It's apparently ears-related.
Still the DietPi armv7l image?

It's apparently ears-related.

Ears seem to behave fine, though.

Still the DietPi armv7l image?

Yes. I didn't have a chance yet to check the behaviour with the zero2_raspios image.

I switched from the zero2_dietpi to the zero2_raspios image.
Behaviour is the same:

  • nabd gets stuck (no visible exceptions in /var/log/nabd.log) and cannot be stopped .
  • Eventually the whole system gets stuck and has to be power-cycled.

Kernel log shows this:

Mar  5 21:04:38 Hazel kernel: [ 2295.957165] rcu: INFO: rcu_sched self-detected stall on CPU
Mar  5 21:04:38 Hazel kernel: [ 2295.957196] rcu: 	0-....: (2099 ticks this GP) idle=fb2/1/0x40000002 softirq=23066/23066 fqs=1049
Mar  5 21:04:38 Hazel kernel: [ 2295.957208] 	(t=2100 jiffies g=43757 q=110)
Mar  5 21:04:38 Hazel kernel: [ 2295.957219] NMI backtrace for cpu 0
Mar  5 21:04:38 Hazel kernel: [ 2295.957236] CPU: 0 PID: 4977 Comm: python Tainted: G        WC O      5.10.92-v7+ #1514
Mar  5 21:04:38 Hazel kernel: [ 2295.957244] Hardware name: BCM2835
Mar  5 21:04:38 Hazel kernel: [ 2295.957252] Backtrace:
Mar  5 21:04:38 Hazel kernel: [ 2295.957287] [<809f4a58>] (dump_backtrace) from [<809f4de8>] (show_stack+0x20/0x24)
Mar  5 21:04:38 Hazel kernel: [ 2295.957303]  r7:ffffffff r6:00000000 r5:60000193 r4:80fe5ef0
Mar  5 21:04:38 Hazel kernel: [ 2295.957322] [<809f4dc8>] (show_stack) from [<809f8ffc>] (dump_stack+0xcc/0xf8)
Mar  5 21:04:38 Hazel kernel: [ 2295.957344] [<809f8f30>] (dump_stack) from [<80666e94>] (nmi_cpu_backtrace+0xc8/0xf4)
Mar  5 21:04:38 Hazel kernel: [ 2295.957361]  r10:80000193 r9:80d0120c r8:80b02060 r7:80b02064 r6:00000000 r5:00000000
Mar  5 21:04:38 Hazel kernel: [ 2295.957372]  r4:00000000 r3:9f188448
Mar  5 21:04:38 Hazel kernel: [ 2295.957392] [<80666dcc>] (nmi_cpu_backtrace) from [<80666fe8>] (nmi_trigger_cpumask_backtrace+0x128/0x140)
Mar  5 21:04:38 Hazel kernel: [ 2295.957404]  r5:80f05ad4 r4:8010e094
Mar  5 21:04:38 Hazel kernel: [ 2295.957426] [<80666ec0>] (nmi_trigger_cpumask_backtrace) from [<8010ef8c>] (arch_trigger_cpumask_backtrace+0x20/0x24)
Mar  5 21:04:38 Hazel kernel: [ 2295.957440]  r7:80f76280 r6:80f05060 r5:80f05104 r4:00000000
Mar  5 21:04:38 Hazel kernel: [ 2295.957459] [<8010ef6c>] (arch_trigger_cpumask_backtrace) from [<809f69fc>] (rcu_dump_cpu_stacks+0x10c/0x144)
Mar  5 21:04:38 Hazel kernel: [ 2295.957479] [<809f68f0>] (rcu_dump_cpu_stacks) from [<8019cb7c>] (rcu_sched_clock_irq+0x7c0/0xa38)
Mar  5 21:04:38 Hazel kernel: [ 2295.957496]  r10:80f03d00 r9:1e083000 r8:00000000 r7:80e932c0 r6:80e932c0 r5:9ef162c0
Mar  5 21:04:38 Hazel kernel: [ 2295.957505]  r4:80f76280
Mar  5 21:04:38 Hazel kernel: [ 2295.957525] [<8019c3bc>] (rcu_sched_clock_irq) from [<801a85ac>] (update_process_times+0x70/0x9c)
Mar  5 21:04:38 Hazel kernel: [ 2295.957542]  r10:801bc3c8 r9:9ef0f640 r8:ffffe000 r7:00000216 r6:3e8e1124 r5:00000000
Mar  5 21:04:38 Hazel kernel: [ 2295.957551]  r4:ffffe000
Mar  5 21:04:38 Hazel kernel: [ 2295.957569] [<801a853c>] (update_process_times) from [<801bbcb4>] (tick_sched_handle+0x64/0x70)
Mar  5 21:04:38 Hazel kernel: [ 2295.957584]  r7:00000216 r6:91e1623b r5:83013e10 r4:9ef0f978
Mar  5 21:04:38 Hazel kernel: [ 2295.957599] [<801bbc50>] (tick_sched_handle) from [<801bc424>] (tick_sched_timer+0x5c/0xb8)
Mar  5 21:04:38 Hazel kernel: [ 2295.957616] [<801bc3c8>] (tick_sched_timer) from [<801a9104>] (__hrtimer_run_queues+0xe8/0x384)
Mar  5 21:04:38 Hazel kernel: [ 2295.957630]  r7:9ef0f680 r6:80f05058 r5:9ef0f640 r4:9ef0f978
Mar  5 21:04:38 Hazel kernel: [ 2295.957649] [<801a901c>] (__hrtimer_run_queues) from [<801aa040>] (hrtimer_interrupt+0x130/0x2b8)
Mar  5 21:04:38 Hazel kernel: [ 2295.957666]  r10:9ef0f790 r9:9ef0f7c0 r8:ffffffff r7:7fffffff r6:00000003 r5:20000193
Mar  5 21:04:38 Hazel kernel: [ 2295.957674]  r4:9ef0f640
Mar  5 21:04:38 Hazel kernel: [ 2295.957695] [<801a9f10>] (hrtimer_interrupt) from [<808303fc>] (arch_timer_handler_phys+0x38/0x40)
Mar  5 21:04:38 Hazel kernel: [ 2295.957712]  r10:7f3172a8 r9:83012000 r8:814de000 r7:000000c3 r6:814e19c0 r5:80fe6284
Mar  5 21:04:38 Hazel kernel: [ 2295.957720]  r4:814eef00
Mar  5 21:04:38 Hazel kernel: [ 2295.957739] [<808303c4>] (arch_timer_handler_phys) from [<8018bd50>] (handle_percpu_devid_irq+0x8c/0x21c)
Mar  5 21:04:38 Hazel kernel: [ 2295.957758] [<8018bcc4>] (handle_percpu_devid_irq) from [<80184f7c>] (generic_handle_irq+0x44/0x54)
Mar  5 21:04:38 Hazel kernel: [ 2295.957772]  r7:00000001 r6:00000000 r5:00000000 r4:80e91d90
Mar  5 21:04:38 Hazel kernel: [ 2295.957791] [<80184f38>] (generic_handle_irq) from [<801856dc>] (__handle_domain_irq+0x6c/0xc4)
Mar  5 21:04:38 Hazel kernel: [ 2295.957811] [<80185670>] (__handle_domain_irq) from [<801012c8>] (bcm2836_arm_irqchip_handle_irq+0x60/0x64)
Mar  5 21:04:38 Hazel kernel: [ 2295.957828]  r9:83012000 r8:83013f58 r7:83013e44 r6:ffffffff r5:80000013 r4:80a0404c
Mar  5 21:04:38 Hazel kernel: [ 2295.957845] [<80101268>] (bcm2836_arm_irqchip_handle_irq) from [<80100abc>] (__irq_svc+0x5c/0x7c)
Mar  5 21:04:38 Hazel kernel: [ 2295.957855] Exception stack(0x83013e10 to 0x83013e58)
Mar  5 21:04:38 Hazel kernel: [ 2295.957868] 3e00:                                     836f65e4 00000000 00000fb8 00000fb7
Mar  5 21:04:38 Hazel kernel: [ 2295.957885] 3e20: 00000014 836f6040 50baf3a0 83013f58 83013f58 836f4040 7f3172a8 83013e6c
Mar  5 21:04:38 Hazel kernel: [ 2295.957899] 3e40: 83013e70 83013e60 7f31731c 80a0404c 80000013 ffffffff
Mar  5 21:04:38 Hazel kernel: [ 2295.957929] [<80a0400c>] (_raw_spin_lock) from [<7f31731c>] (cr14_write+0x74/0x67c [cr14])
Mar  5 21:04:38 Hazel kernel: [ 2295.957956] [<7f3172a8>] (cr14_write [cr14]) from [<80335150>] (vfs_write+0xd4/0x464)
Mar  5 21:04:38 Hazel kernel: [ 2295.957974]  r10:7f3172a8 r9:00000000 r8:83013f58 r7:50baf3a0 r6:80f05008 r5:00000014
Mar  5 21:04:38 Hazel kernel: [ 2295.957983]  r4:83d13600
Mar  5 21:04:38 Hazel kernel: [ 2295.958003] [<8033507c>] (vfs_write) from [<80335644>] (ksys_write+0x68/0xe0)
Mar  5 21:04:38 Hazel kernel: [ 2295.958020]  r10:00000014 r9:50baf3a0 r8:83d13600 r7:00000000 r6:0000005c r5:83d13601
Mar  5 21:04:38 Hazel kernel: [ 2295.958029]  r4:80f05008
Mar  5 21:04:38 Hazel kernel: [ 2295.958048] [<803355dc>] (ksys_write) from [<803356d4>] (sys_write+0x18/0x1c)
Mar  5 21:04:38 Hazel kernel: [ 2295.958065]  r10:00000004 r9:83012000 r8:80100204 r7:00000004 r6:00000014 r5:76f07e40
Mar  5 21:04:38 Hazel kernel: [ 2295.958074]  r4:0000000b
Mar  5 21:04:38 Hazel kernel: [ 2295.958092] [<803356bc>] (sys_write) from [<80100040>] (ret_fast_syscall+0x0/0x28)
Mar  5 21:04:38 Hazel kernel: [ 2295.958102] Exception stack(0x83013fa8 to 0x83013ff0)
Mar  5 21:04:38 Hazel kernel: [ 2295.958116] 3fa0:                   0000000b 76f07e40 0000000b 50baf3a0 00000014 00000000
Mar  5 21:04:38 Hazel kernel: [ 2295.958132] 3fc0: 0000000b 76f07e40 00000014 00000004 0000000b 50baf3a0 00000000 01f68908
Mar  5 21:04:38 Hazel kernel: [ 2295.958144] 3fe0: 00000000 7e816d28 00000000 76eb4ec0

Problem occurs on Pi Zero 2 both with "zero2" and "zero" Pynab images.
This seems to confirm the problem is with the armv7l version of the cr14 driver / kernel.

Fixed. We just need to rebuild images to pick up latest version of CR14 driver.

Thanks. I validated the fix on Pi Zero 2 with my current image by updating/rebuilding driver after switching to cr14 master branch.
All appears to be fine now, both for configuring and reading tags.

Can you please please push the fix to cr14 release branch, so that can I rebuild my images (pguyot/cr14#5)