foss-for-synopsys-dwc-arc-processors/linux

HS58: Linux occasionally does not boot with ATLD implementation of atomics

Opened this issue · 35 comments

After switching to ATLD atomics we see occasional hangs while the Linux boots. The last message that is seen on the console is:

Advanced Linux Sound Architecture Driver Initialized.

The behavior was first described here: #162

If I attach OpenOCD JTAG to the LPU the Linux continues booting.

For 2177 resets the issue happened 626 times, so repro rate is quite high 28%.

This seems to be buildroot (Linux) build specific, as this is quite easily reproducible with one buildroot build, but didn’t happen with another one (with changes not related to this issue).

I started the OpenOCD session and then restarted the board. It stopped booting at the “Advanced Linux Sound Architecture Driver Initialized." message. I opened OpenOCD console and typed few command:

> reg pc
pc (/32): 0x814e0c42

> targets
    TargetName         Type       Endian TapName            State
--  ------------------ ---------- ------ ------------------ ------------
 0* arc-em.lpu0        arcv3      little arc-em.lpu0        running
 1  arc-em.lpu1        arcv3      little arc-em.lpu1        running
 2  arc-em.lpu2        arcv3      little arc-em.lpu2        running

> reg pc
pc (/32): 0x814e0c42

>
>
> reg pc
pc (/32): 0x814e0c42

> targets 1
use 'arc-em.lpu1' as target identifier, not '1'
> reg pc
pc (/32): 0x81aa5b82

>

The last command (reg pc on the CPU 1) caused Linux to continue booting.

I was able to gather the CPU trace (smart trace). Interestingly, reading AUX regs needed to gather the trace didn’t cause the Linux to continue, unlike “reg pc“ command.

The last entry in the smart trace is:

entry 0:
	SRC = 0x814E0C42
	DST = 0x814E0C3E
	FLAGS = 0x80000400

and corresponds to the following instructions from the vmlinux listing assembler code (function multi_cpu_stop):

814e0c34:	272f 00bf           	clri	r2
814e0c38:	236f 10ff           	dmb	0x3
814e0c3c:	4082                	mov_s	r0,r20
814e0c3e:	202f 048c           	ex	r0,[r18]                                            <<<< DST
814e0c42:	08ff 8050           	breq	r0,0x1,-2	;814e0c3e <multi_cpu_stop+0xa2>     <<<< SRC
814e0c46:	236f 10ff           	dmb	0x3
814e0c4a:	4103                	mov_s	r1,r24
814e0c4c:	212f 0430           	atld.add	r1,[r16]

FLAGS = 0x80000400 means that the bit repeat (0x400) is set, so the CPU spins in the loop.

An ex instruction is one of the “Atomic Memory Operations (AMOs) - 32 bit“.

Could it be a prove that this is some kind of a deadlock?

hs58_defconfig.txt

I'm attaching example Linux defconfig that is used in our tests.

I attach SmaRT trace for all cores, as read by the Metaware Debugger

core-0-smart.txt
core-1-smart.txt
core-2-smart.txt

Thanks Jakub,

Looks like the other 2 cores stopped execution somewhere in the function rcu_dynticks_inc:

81aa5b28 <rcu_dynticks_inc>:
81aa5b28:	1cf8 b38e           	std.aw	r14r15,[sp,-8]
81aa5b2c:	c0f1                	push_s	blink
81aa5b2e:	46cb 8143 c680      	mov_s	r14,0x8143c680
81aa5b34:	0be6 0020           	bl.d	996	;81aa5f18 <debug_smp_processor_id>
81aa5b38:	4708                	mov_s	r15,r0
81aa5b3a:	41c3 81d3 eb84      	mov_s	r1,0x81d3eb84
81aa5b40:	2655 194e           	add2	r14,r14,0x25
81aa5b44:	21f0 0000           	ld.as	r0,[r1,r0]
81aa5b48:	661e                	add_s	r14,r14,r0
81aa5b4a:	236f 10ff           	dmb	0x3
81aa5b4e:	272f 00ff           	clri	r3
81aa5b52:	236f 10ff           	dmb	0x3
81aa5b56:	42c3 81d1 609c      	mov_s	r2,0x81d1609c
81aa5b5c:	710c                	mov_s	r0,1
81aa5b5e:	202f 008c           	ex	r0,[r2]
81aa5b62:	08ff 8050           	breq	r0,0x1,-2	;81aa5b5e <rcu_dynticks_inc+0x36>   <<<<<<<<<<<<<<<<< core2
81aa5b66:	236f 10ff           	dmb	0x3
81aa5b6a:	40e1                	mov_s	r0,r15
81aa5b6c:	202f 03b0           	atld.add	r0,[r14]
81aa5b70:	2000 03c0           	add	r0,r0,r15
81aa5b74:	236f 10ff           	dmb	0x3
81aa5b78:	702c                	mov_s	r1,0
81aa5b7a:	212f 008c           	ex	r1,[r2]
81aa5b7e:	236f 10ff           	dmb	0x3
81aa5b82:	262f 00ff           	seti	r3                      <<<<<<<<<<<<<<<<< core1
81aa5b86:	236f 10ff           	dmb	0x3
81aa5b8a:	c0d1                	pop_s	blink
81aa5b8c:	7fe0                	j_s.d	[blink]
81aa5b8e:	1408 358e           	ldd.ab	r14r15,[sp,8]
81aa5b92:	78e0                	nop_s

I added defconfig options for additional checking of the locking mechanisms:

CONFIG_TEST_LOCKUP=m
CONFIG_PROVE_LOCKING=y
CONFIG_PROVE_RAW_LOCK_NESTING=y
CONFIG_LOCK_STAT=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y

The boot log:
linux_boot_log_selftest.txt

The boot stalls after Advanced Linux Sound Architecture Driver Initialized. message.

The tests seem to pass: Good, all 391 testcases passed! |

This part shows some errors:

unwinder looping too long, aborting !
unwinder looping too long, aborting !
unwinder looping too long, aborting !
unwinder looping too long, aborting !
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.127 #10

Stack Trace:
  arc_unwind_core+0xf4/0x110
  dump_stack_lvl+0x4c/0x74
  register_lock_class+0x7f0/0x7f4
  __lock_acquire+0x68/0x2ae8
  lock_acquire+0xfc/0x34c
  _raw_spin_lock_irqsave+0x36/0x70
  complete+0x16/0x40
  rest_init+0x118/0x21c
  0x8158ffa4

Another symptom of the issue is that with ATLD the rcutorture module shows "process hang" warning. With previous LLSC option the rcutorture was working fine.

# modprobe rcutorture
rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=60 verbose=1 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 stall_cpu=0 stall_cpu_holdoff=10 stall_cpu_irqsoff=0 stall_cpu_block=0 n_barrier_cbs=0 onoff_interval=0 onoff_holdoff=0 read_exit_delay=13 read_exit_burst=16 nocbs_nthreads=0 nocbs_toggle=1000
rcu:  Start-test grace-period state: g26265 f0x0
rcu_torture_write_types: Testing conditional GPs.
rcu_torture_write_types: Testing expedited GPs.
rcu_torture_write_types: Testing asynchronous GPs.
rcu_torture_write_types: Testing polling GPs.
rcu_torture_write_types: Testing normal GPs.
rcu-torture: Creating rcu_torture_writer task
rcu-torture: Creating rcu_torture_fakewriter task
rcu-torture: rcu_torture_writer task started
rcu-torture: Creating rcu_torture_fakewriter task
rcu-torture: rcu_torture_fakewriter task started
rcu-torture: Creating rcu_torture_fakewriter task
rcu-torture: rcu_torture_fakewriter task started
rcu-torture: Creating rcu_torture_fakewriter task
rcu-torture: rcu_torture_fakewriter task started
rcu-torture: Creating rcu_torture_reader task
rcu-torture: rcu_torture_fakewriter task started
rcu-torture: Creating rcu_torture_reader task
rcu-torture: rcu_torture_reader task started
rcu-torture: Creating rcu_torture_stats task
rcu-torture: rcu_torture_reader task started
rcu-torture: Creating torture_shuffle task
rcu-torture: rcu_torture_stats task started
rcu-torture: Creating torture_stutter task
rcu-torture: torture_shuffle task started
rcu-torture: torture_stutter task started
rcu-torture: Creating rcu_torture_fwd_prog task
rcu-torture: Creating rcu_torture_read_exit task
rcu-torture: rcu_torture_fwd_progress task started
rcu-torture: rcu_torture_read_exit: Start of test
#
#
#
# rcu-torture: rcu_torture_read_exit: End of episode
rcu-torture: rcu_torture_read_exit: Start of episode
rcu-torture: rcu_torture_read_exit: End of episode
INFO: task torture_stutter:2207 blocked for more than 10 seconds.
      Tainted: G           O      5.15.127 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:torture_stutter state:D stack:    0 pid: 2207 ppid:     2 flags:0x00000000

Stack Trace:
  __switch_to+0x0/0xa8
  __schedule+0x2c8/0x6c4
  schedule+0x50/0xf0
  schedule_hrtimeout_range_clock+0xd8/0x138
  torture_stutter+0x15a/0x1c0 [torture]
  kthread+0x10e/0x144
  ret_from_fork+0x14/0x18
rcu-torture: rcu_torture_read_exit: Start of episode
rcu-torture: rcu_torture_read_exit: End of episode
INFO: task torture_stutter:2207 blocked for more than 10 seconds.
      Tainted: G           O      5.15.127 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:torture_stutter state:D stack:    0 pid: 2207 ppid:     2 flags:0x00000000

Stack Trace:
  __switch_to+0x0/0xa8
  __schedule+0x2c8/0x6c4
  schedule+0x50/0xf0
  schedule_hrtimeout_range_clock+0xd8/0x138
  torture_stutter+0x15a/0x1c0 [torture]
  kthread+0x10e/0x144
  ret_from_fork+0x14/0x18
rcu-torture: rcu_torture_read_exit: Start of episode
rcu-torture: rcu_torture_read_exit: End of episode

#
#
# rcu-torture: rtc: 36277d34 ver: 705 tfle: 0 rta: 706 rtaf: 0 rtf: 694 rtmbe: 0 rtmbkf: 0/537 rtbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 3520 barrier: 0/0:0 read-exits: 67 nocb-toggles: 0:0
rcu-torture: Reader Pipe:  16053121 784 0 0 0 0 0 0 0 0 0
rcu-torture: Reader Batch:  16051820 2085 0 0 0 0 0 0 0 0 0
rcu-torture: Free-Block Circulation:  705 704 703 700 699 698 697 696 695 694 0
rcu-torture: rcu_torture_read_exit: Start of episode
rcu-torture: rcu_torture_read_exit: End of episode

#
#
#
#
#
# INFO: task torture_stutter:2207 blocked for more than 10 seconds.
      Tainted: G           O      5.15.127 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:torture_stutter state:D stack:    0 pid: 2207 ppid:     2 flags:0x00000000

Stack Trace:
  __switch_to+0x0/0xa8
  __schedule+0x2c8/0x6c4
  schedule+0x50/0xf0
  schedule_hrtimeout_range_clock+0xd8/0x138
  torture_stutter+0x15a/0x1c0 [torture]
  kthread+0x10e/0x144
  ret_from_fork+0x14/0x18

#
# rcu-torture: rcu_torture_read_exit: Start of episode
rcu-torture: rcu_torture_read_exit: End of episode

#
#
#
# rcu_torture_fwd_prog_nr: Duration 1431 cver 240 gps 455
rcu_torture_fwd_prog_cr Duration 12 barrier: 6 pending 37078 n_launders: 87940 n_launders_sa: 8925 n_max_gps: 100 n_max_cbs: 64339 cver 0 gps 9
rcu_torture_fwd_cb_hist: Callback-invocation histogram (duration 19 jiffies): 1s/10: 79016:6 2s/10: 73263:4
rcu-torture: rcu_torture_read_exit: Start of episode
rcu-torture: rcu_torture_read_exit: End of episode
INFO: task torture_stutter:2207 blocked for more than 10 seconds.
      Tainted: G           O      5.15.127 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:torture_stutter state:D stack:    0 pid: 2207 ppid:     2 flags:0x00000000

Stack Trace:
  __switch_to+0x0/0xa8
  __schedule+0x2c8/0x6c4
  schedule+0x50/0xf0
  schedule_hrtimeout_range_clock+0xd8/0x138
  torture_stutter+0x15a/0x1c0 [torture]
  kthread+0x10e/0x144
  ret_from_fork+0x14/0x18
rcu-torture: rcu_torture_read_exit: Start of episode
rcu-torture: rcu_torture_read_exit: End of episode
INFO: task torture_stutter:2207 blocked for more than 20 seconds.
      Tainted: G           O      5.15.127 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:torture_stutter state:D stack:    0 pid: 2207 ppid:     2 flags:0x00000000

Stack Trace:
  __switch_to+0x0/0xa8
  __schedule+0x2c8/0x6c4
  schedule+0x50/0xf0
  schedule_hrtimeout_range_clock+0xd8/0x138
  torture_stutter+0x15a/0x1c0 [torture]
  kthread+0x10e/0x144
  ret_from_fork+0x14/0x18
rcu-torture: rcu_torture_read_exit: Start of episode
rcu-torture: rcu_torture_read_exit: End of episode

#
#
#
#
#
#
#
# rmmod rcutorture
rcu-torture: Stopping torture_shuffle task
rcu-torture: Stopping rcu_torture_reader
rcu-torture: Stopping rcu_torture_reader
rcu-torture: Stopping torture_shuffle
rcu-torture: Stopping torture_stutter task
rcu-torture: Stopping torture_stutter
rcu: rcu_preempt: wait state: RCU_GP_WAIT_FQS(5) ->state: 0x402 ->rt_priority 0 delta ->gp_start 1 ->gp_activity 1 ->gp_req_activity 1 ->gp_wake_time 1 ->gp_wake_seq 39469 ->gp_seq 39473 ->gp_seq_needed 39480 ->gp_max 33 ->gp_flags 0x0
rcu-torture: Stopping rcu_torture_fakewriter
rcu:    rcu_node 0:2 ->gp_seq 39473 ->gp_seq_needed 39480 ->qsmask 0x0 .... ->n_boosts 0
rcu:    cpu 0 ->gp_seq_needed 39480
rcu:    cpu 1 ->gp_seq_needed 39480
rcu:    cpu 2 ->gp_seq_needed 39480
rcu: RCU callbacks invoked since boot: 278378
rcu_tasks: RTGS_WAIT_CBS(11) since 131868 g:0 i:0/0 k.
rcu_tasks_rude: RTGS_WAIT_CBS(11) since 131868 g:0 i:0/0 k.
rcu_tasks_trace: RTGS_WAIT_CBS(11) since 131868 g:0 i:0/0 k. N0 h:0/0/0
rcu-torture: Stopping rcu_torture_fakewriter
rcu-torture: Stopping rcu_torture_writer
rcu-torture: Stopping rcu_torture_fakewriter
rcu-torture: rtc: 00000000 VER: 1631 tfle: 0 rta: 1631 rtaf: 0 rtf: 1621 rtmbe: 0 rtmbkf: 0/1175 rtbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 7924 barrier: 0/0:0 read-exits: 152 nocb-toggles: 0:0
rcu-torture: Reader Pipe:  34942090 1744 0 0 0 0 0 0 0 0 0
rcu-torture: Reader Batch:  34939104 4728 0 0 0 0 0 0 0 0 0
rcu-torture: Free-Block Circulation:  1630 1630 1629 1628 1627 1625 1624 1623 1622 1621 0
rcu-torture: Stopping rcu_torture_stats
rcu-torture: Stopping rcu_torture_fakewriter
rcu-torture: Stopping rcu_torture_read_exit
rcu-torture: Stopping rcutorture_read_exit task
rcu-torture: Stopping rcu_torture_fwd_prog task
rcu_torture_fwd_prog: tested 1 tested_tries 2
rcu-torture: Stopping rcu_torture_fwd_prog
rcu-torture: Stopping rcu_torture_writer task
rcu-torture: Stopping rcu_torture_reader task
rcu-torture: Stopping rcu_torture_reader task
rcu-torture: Stopping rcu_torture_fakewriter task
rcu-torture: Stopping rcu_torture_fakewriter task
rcu-torture: Stopping rcu_torture_fakewriter task
rcu-torture: Stopping rcu_torture_fakewriter task
rcu:  End-test grace-period state: g39505 f0x0 total-gps=3310
rcu-torture: Stopping rcu_torture_stats task
mem_dump_obj() slab test: rcu_torture_stats = 00000000, &rhp = 86707f08, rhp = 880567f8, &z = ac1d9580
mem_dump_obj(ZERO_SIZE_PTR): zero-size pointer
mem_dump_obj(NULL): NULL pointer
mem_dump_obj(86707f08): non-slab/vmalloc memory
mem_dump_obj(880567f8): slab rcuscale start 880567f8 pointer offset 0
mem_dump_obj(880567fc): slab rcuscale start 880567f8 pointer offset 4
mem_dump_obj(ac1d9580): 11-page vmalloc region starting at 0xac1d1000 allocated at load_module+0xa1e/0x1ff8
mem_dump_obj() kmalloc test: rcu_torture_stats = 00000000, &rhp = 86707f08, rhp = 87533680
mem_dump_obj(kmalloc 87533680): slab kmalloc-128 start 87533680 pointer offset 0 size 128
mem_dump_obj(kmalloc 87533684): slab kmalloc-128 start 87533680 pointer offset 4 size 128
mem_dump_obj() vmalloc test: rcu_torture_stats = 00000000, &rhp = 86707f08, rhp = ac02c000
mem_dump_obj(vmalloc ac02c000): 1-page vmalloc region starting at 0xac02c000 allocated at rcu_torture_cleanup+0x49a/0x6b0 [rcutorture]
mem_dump_obj(vmalloc ac02c004): 1-page vmalloc region starting at 0xac02c000 allocated at rcu_torture_cleanup+0x49a/0x6b0 [rcutorture]
rcu-torture: rtc: 00000000 VER: 1631 tfle: 0 rta: 1631 rtaf: 0 rtf: 1621 rtmbe: 0 rtmbkf: 0/1175 rtbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 7924 barrier: 0/0:0 read-exits: 152 nocb-toggles: 0:0
rcu-torture: Reader Pipe:  34942090 1744 0 0 0 0 0 0 0 0 0
rcu-torture: Reader Batch:  34939104 4728 0 0 0 0 0 0 0 0 0
rcu-torture: Free-Block Circulation:  1630 1630 1629 1628 1627 1625 1624 1623 1622 1621 0
rcu-torture:--- End of test: SUCCESS: nreaders=2 nfakewriters=4 stat_interval=60 verbose=1 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 stall_cpu=0 stall_cpu_holdoff=10 stall_cpu_irqsoff=0 stall_cpu_block=0 n_barrier_cbs=0 onoff_interval=0 onoff_holdoff=0 read_exit_delay=13 read_exit_burst=16 nocbs_nthreads=0 nocbs_toggle=1000
#
#
#

I tried reproducing on nSim (multicore), but no luck so far.

I'm not sure how the mdb was able to read the smart trace data from all the cores, because from my observation if I read (using OpenOCD) SMART_CONTROL or SMART_DATA registers from core 1 or 2 the Linux immediately resumes booting. It does not have any effect only with core 0. So using OpenOCD I can grab only smart trace from core 0.

Reading status32 register on any core does not resume the Linux boot.

> reg status32
status32 (/32): 0x00080002

> targets 1
use 'arc-em.lpu1' as target identifier, not '1'
> reg status32
status32 (/32): 0x00080002

> targets 2
use 'arc-em.lpu2' as target identifier, not '2'
> reg status32
status32 (/32): 0x00080002

> targets 0
use 'arc-em.lpu0' as target identifier, not '0'
> reg pc
pc (/32): 0x814e0c42

> targets 2
use 'arc-em.lpu2' as target identifier, not '2'
> reg pc
pc (/32): 0x81aa5b82

>

After the last command on core 2 the CPUs were resumed.

Reading AUX regs 0x4, 0x6 (IDENTITY, PC) even on core 0 resumes the CPUs.
Reading AUX reg 0xA (STATUS32) on any core does not have any effect - the cluster is stalled.

My suspicion is that OpenOCD in case of some commands triggers another read or write, which has the effect described above.

There is the hypothesis that it could be a lockup in the CPU pipeline and JTAG read/write fills/flushes the pipeline and unlocks the CPUs. This hypothesis is still valid.

@abrodkin @xxkent are you able to reproduce it on your side on your FPGA system?

I ran Linux on HAPS and in nSIM with options you provided:
CONFIG_TEST_LOCKUP=m
CONFIG_PROVE_LOCKING=y
CONFIG_PROVE_RAW_LOCK_NESTING=y
CONFIG_LOCK_STAT=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y

In both cases there is "Stack Trace message" during boot, this means that this is SW issue:

Stack Trace:
  arc_unwind_core+0xf4/0x110
  dump_stack_lvl+0x4e/0x78
  register_lock_class+0x630/0x634
  __lock_acquire+0x68/0x1bfc
  lock_acquire+0xee/0x32c
  _raw_spin_lock_irqsave+0x40/0x78
  complete+0x16/0x40
  rest_init+0x118/0x21c
  0x81298eac
Running RCU-tasks wait API self tests 

Linux_boot_HAPS.txt
Linux_boot_nSIM.txt

It is quite strange that the deadlock test runs on only core_0 while core_1 and core_2 are in the boot loop. Thus, core_1 and core_2 don't participate in this test.

I will conduct this test with your hs58_defconfig.txt as the next step.

The issue (Linux occasionally does not boot with ATLD implementation of atomics) is not reproduced on HAPS. I see some other issue with boot, but it is most likely SW because it exists in nSIM too.

HAPS_Linux_boot.txt
nSIM_Linux_boot.txt

Thanks for checking this. How many resets have you performed? In our case repro rate is not 100%, so to be sure you will need to perform tens of resets.
Have you enabled similar defconfig options as per the defconfig I attached? I can do the opposite - to cut down the defconfig and see if there is any change.

I took exactly config you provided (hs58_defconfig.txt) and changed only CONFIG_LINUX_LINK_BASE=0x0, CONFIG_LINUX_RAM_BASE=0x0, CONFIG_LINUX_MAP_SIZE=0x60000000 and CONFIG_ARC_BUILTIN_DTB_NAME=haps_hs5x_idu.

I did about 6 attempts on HAPS. Ok, then I'll be running this throughout the day to get more attempts and if anything changes, I'll let you know.

A suggestion from Alexey was to check if with ATLD selected in menuconfig there are still ‘llock’ and ‘scond’ instructions in the listing. The result is no, there are none - grep'ing for these finds nothing

>grep --ignore-case llock vmlinux.lst
>grep --ignore-case scond vmlinux.lst
>

This issue is not reproducible with single-core SMP Linux.

I attach SmaRT trace for all cores, as read by the Metaware Debugger

core-0-smart.txt core-1-smart.txt core-2-smart.txt

Could you provide call stacks for each core from MDB?

Try this patch also please: 28e6344

Hi @xxkent, I was testing issue #168 with the 28e6344 patch applied but the sshd was freezing with the following error:

INFO: task sshd:208 blocked for more than 10 seconds.
      Tainted: G           O      5.15.127 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:sshd            state:D stack:    0 pid:  208 ppid:   206 flags:0x00000001

Stack Trace:
  __switch_to+0x0/0xa8
  __schedule+0x272/0x608
  schedule+0x50/0xf0
  schedule_timeout+0x88/0xb0
  ldsem_down_read+0x128/0x238
  tty_ldisc_ref_wait+0x18/0x38
  tty_poll+0x2e/0xb4
  do_sys_poll+0x1f2/0x454
  sys_ppoll+0x76/0xb0
  EV_Trap+0xb8/0xbc

However, this problem can be reproduced on ssh running on localhost. Steps:

  • run ssh localhost,
  • repeatedly send some commands, e.g. uname,
  • sshd freezes after a while.

You can try reproducing it on your side, maybe it will help in solving the bugs.

The good news is that with patch 28e6344 I no longer can see boot-time freeze, tried over 2k times.

Could you provide call stacks for each core from MDB?

Yes, here new capture of SmaRT + call stacks for each core:
callstack-core0.txt
callstack-core1.txt
callstack-core2.txt
smart-core0.txt
smart-core1.txt
smart-core2.txt

This time I reproduced this by loading and starting vmlinux from MDB, result was the same.

Hi @jzbydniewski!

Thank you for this output it is very valuable. For clearer picture can you gather some additional information? Please provide a SmaRT trace and registry dump of all cores for a few additional runs to ensure we are in the same state every time we hang.

p.s. unfortunately commit 28e6344 can't be used as a final solution of the described issue, additional changes may be required due to #167 (comment) and moreover it may hide a real problem with CPU.

Hi @jzbydniewski!

Thank you for this output it is very valuable. For clearer picture can you gather some additional information? Please provide a SmaRT trace and registry dump of all cores for a few additional runs to ensure we are in the same state every time we hang.

Logs for next 5 reproductions:
logs.zip

I did two experiments with enabled hw counters, as below:

hwc count cond=iall cond=crun cond=illock cond=iscond cond=imematomic cond=dbgflush cond=bstall cond=always

1st experiment: issue reproduce, I stopped cores in mdb after <1s:

Core 0:
image

Core 1:
image

Core 2:
image

2nd experiment: issue reproduce, I stopped cores in mdb after ~10s:

Core 0
image

Core 1:
image

Core 2:
image

Regards,
Jakub

Should we also force ATLD setting (I don't see that option in the version I'm currently using) in buildroot menuconfig, not only linux menuconfig ?

image

I can see that resuming cores with use of MDB works (and Linux continues executing) only if I do not pass param
-off=flush_dcache
With this param passed I can see cores are still stuck (!) after I resume.

Looking at data caches and memory (still with -off=flush_dcache option set), I can see that address 0x81d6609c (const across reproductions with the same vmlinux), that seems to be used with atomic exchange "ex" instruction in core 1 and core 2, is 0x0 in memory, but 0x01 in core 2 data cache. Could this be causing this issue ?

cache contents:
dcache-core1.txt
dcache-core2.txt
dcache-core3.txt
l2-cache.txt

SmaRT:
smart-core1.txt
smart-core2.txt
smart-core3.txt

regs:
regs-core1.txt
regs-core2.txt
regs-core3.txt

image

I did a simple experiment in that state, simply changed the value at 0x81d6609c to 1 and resumed cores (so kind of manual flush) - as a result Linux booted to the console.

Conduct please the following experiment:

  1. Find location of a function "rcu_dynticks_inc()" in the memory
  2. In assembly find 2 "ex" instructions belonging to "rcu_dynticks_inc()"
  3. Write down the memory addresses where these "ex" instructions are located.
    Let them be 0xAAAAAAAA and 0xBBBBBBBB
  4. Go to the memory window by address 0xAAAAAAAA and 0xBBBBBBBB and write down opcode values.
    Let them be 0x11111111 and 0x22222222
  5. You need to patch 2 opcodes. 0x11111111->0x91111111 and 0x22222222->0xA2222222.
    That is set bit <31> to one. The assembly code should change form "ex %r0,[%r3]" to "ex.di %r0,[%r3]"
    (register numbers shouldn't be changed). ".di" means uncached access.
  6. Run Linux and check whether issue is still there

To patch automatically you can add theses commands to your commands.txt file:
evalq *((int *)0xAAAAAAAA)=0x91111111
evalq *((int *)0xBBBBBBBB)=0xA2222222

As the next step, if it won't help, you can also do the same for multi_cpu_stop() and _raw_spin_unlock_irqrestore().

@xxkent I tried this for the rcu_dynticks_inc() at the first place,

evalq *((int *)0x81af5b5e)=0x808c202f
evalq *((int *)0x81af5b7a)=0x808c212f

image

As a result I am getting crash

smp: Bringing up secondary CPUs ...
Idle Task [1] (ptrval)
Trying to bring up CPU1 ...

IDENTITY        : ARCVER [0x60] ARCNUM [0x1] CHIPID [ 0x0]
processor [1]   : HS58 (ARC32 ARCv3 ISA)  Dual-Issue
ISA Extn        : atomic (not used) atomic2 ll128 (not used) unalign mpy[opt 9] div64
MMU [v10]       : MMU32 hwalk 3 levels, 4k PAGE, JTLB 1024 uD/I 8/4
                 tlb_flush_mm Incr ASID
I-Cache         : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache         : 64K, 2way/set, 64B Line, PIPT
HW PF           : RD 2 WR 2 OUTS 2 AG 1 (disabled)
L2              : 768K, 64B Line
Timers          : Timer0 Timer1 RTC [UP 64-bit] GFRC [SMP 64-bit]
Vector Table    : 0x8050000
Extn [SMP]      : ARConnect (v3): 3 cores with IPI IDU DEBUG GFRC
archs-intc      : 3 priority levels (default 1)
## CPU1 LIVE ##: Executing Code...

Invalid Mem Access

Invalid Mem Access
Path: (null)
Path: (null)
Idle Task [2] (ptrval)
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.127 #7
Trying to bring up CPU2 ...
Bus Error from Data Mem

IDENTITY        : ARCVER [0x60] ARCNUM [0x2] CHIPID [ 0x0]
processor [2]   : HS58 (ARC32 ARCv3 ISA)  Dual-Issue
ISA Extn        : atomic (not used) atomic2 ll128 (not used) unalign mpy[opt 9] div64
MMU [v10]       : MMU32 hwalk 3 levels, 4k PAGE, JTLB 1024 uD/I 8/4
                 tlb_flush_mm Incr ASID
I-Cache         : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache         : 64K, 2way/set, 64B Line, PIPT
HW PF           : RD 2 WR 2 OUTS 2 AG 1 (disabled)
L2              : 768K, 64B Line
Timers          : Timer0 Timer1 RTC [UP 64-bit] GFRC [SMP 64-bit]
Vector Table    : 0x8050000
Extn [SMP]      : ARConnect (v3): 3 cores with IPI IDU DEBUG GFRC
archs-intc      : 3 priority levels (default 1)
ECR: 0x00011000 EFA: 0x81d6609c ERET: 0x81af5b62
STAT: 0x00080002
## CPU2 LIVE ##: Executing Code...
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.127 #7
Bus Error from Data Mem
ECR: 0x00011000 EFA: 0x81d6609c ERET: 0x81af5b62
STAT: 0x00080002 [  K     ]   BTA: 0x81af5b3a
 SP: 0x81cb5fb4  FP: 0x00000000 BLK: rcu_dynticks_inc+0x12/0x6c
r00: 0x00000001 r01: 0x81d8eb84 r02: 0x81d6609c
r03: 0x00000021 r04: 0x8e778000 r05: 0x00000006
r06: 0x00000008 r07: 0x98968000 r08: 0xffffffff
r09: 0xfffffff7 r10: 0x00000000 r11: 0x00000004
r12: 0xab815588

Stack Trace:
  rcu_dynticks_inc+0x3a/0x6c
  default_idle_call+0x1a/0x6c
  do_idle+0x80/0xfc
  cpu_startup_entry+0x14/0x18
  start_kernel+0x4a4/0x4c0

Invalid Mem Access
Path: (null)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.127 #7
Bus Error from Data Mem
ECR: 0x00011000 EFA: 0x81d6609c ERET: 0x81af5b62
STAT: 0x00080002 [  K     ]   BTA: 0x81af5b3a
 SP: 0x81cb5fb4  FP: 0x00000000 BLK: rcu_dynticks_inc+0x12/0x6c
r00: 0x00000001 r01: 0x81d8eb84 r02: 0x81d6609c
r03: 0x00000021 r04: 0x8e778000 r05: 0x00000006
r06: 0x00000008 r07: 0x98968000 r08: 0xffffffff
r09: 0xfffffff7 r10: 0x00000000 r11: 0x00000004
r12: 0xab815588

Stack Trace:
  rcu_dynticks_inc+0x3a/0x6c
  default_idle_call+0x1a/0x6c
  do_idle+0x80/0xfc
  cpu_startup_entry+0x14/0x18
  start_kernel+0x4a4/0x4c0
 [  K     ]   BTA: 0x81af5b3a
 SP: 0x82065fd8  FP: 0x00000000 BLK: rcu_dynticks_inc+0x12/0x6c
r00: 0x00000001 r01: 0x81d8eb84 r02: 0x81d6609c
r03: 0x00000021 r04: 0x270e0000 r05: 0x00000007
r06: 0x00000008 r07: 0x98968000 r08: 0xffffffff
r09: 0xfffffff7 r10: 0x00000000 r11: 0x00000004
r12: 0x81affe98

Stack Trace:
  rcu_dynticks_inc+0x3a/0x6c
  default_idle_call+0x1a/0x6c
  do_idle+0x80/0xfc
  cpu_startup_entry+0x14/0x18
  first_lines_of_secondary+0x0/0x38

Invalid Mem Access
Path: (null)
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.15.127 #7
Bus Error from Data Mem
ECR: 0x00011000 EFA: 0x81d6609c ERET: 0x81af5b62
STAT: 0x00080002 [  K     ]   BTA: 0x81af5b3a
 SP: 0x82067fd8  FP: 0x00000000 BLK: rcu_dynticks_inc+0x12/0x6c
r00: 0x00000001 r01: 0x81d8eb84 r02: 0x81d6609c
r03: 0x00000021 r04: 0xbfa48000 r05: 0x00000007
r06: 0x00000008 r07: 0x98968000 r08: 0xffffffff
r09: 0xfffffff7 r10: 0x00000000 r11: 0x00000004
r12: 0x81affe98

Stack Trace:
  rcu_dynticks_inc+0x3a/0x6c
  default_idle_call+0x1a/0x6c
  do_idle+0x80/0xfc
  cpu_startup_entry+0x14/0x18
  first_lines_of_secondary+0x0/0x38

Thanks @jzbydniewski. This is quite unexpected for me that access to a global variable with .di gets "BUS FAULT". Let me to discuss with colleagues.

Is also interesting, that when I tried to apply the same for other function multi_cpu_stop I got unexpected result in dissasembly

evalq *((int *)0x81530c3e)=0xa02f048c

image

while memory window shows expected value..

image

This behavior is predictable since the BUS don't support atomic operation, means that we can't bypass this issue such way using .di.

evalq *((int *)0x81530c3e)=0xa02f048c

You swapped 0,1 bytes with 2,3. Disassembly shows bytes 0 and 1 in memory as high bytes and 2 and 3 as low bytes.

It looks that with CONFIG_ARC_HAS_LL64=n this issue is not showing up, but that might be just due to different timing I guess.