sched-ext/scx

scx_rusty: Exits immediately after start when a kernel compilation is running

YUBY64 opened this issue · 1 comment

scx_rusty exits immediately after starting if a kernel compilation is already running. If I start scx_rusty first and then start the kernel compilation, it works fine. Other schedulers are unaffected.

13:25:47 [INFO] Running scx_rusty (build ID: 1.0.4-gaea431c x86_64-unknown-linux-gnu)
13:25:47 [INFO] NODE[00] mask= ffff
13:25:47 [INFO]  DOM[00] mask= ffff
13:25:48 [INFO] Rusty scheduler started! Run `scx_rusty --monitor` for metrics.

DEBUG DUMP
================================================================================

scx_rusty[105233] triggered exit kind 1025:
  scx_bpf_error (Failed to lookup task 105197)

Backtrace:
  scx_bpf_error_bstr+0xf9/0x1a0
  bpf_prog_3d1f5ab55b43878c_dom_xfer_task+0x61/0x13b0
  bpf_prog_bf2f195073ed7dd1_task_set_domain+0x153/0x1c9
  bpf_prog_f9ebe23c1ccd299e_task_pick_and_set_domain+0x4b/0x9f
  bpf_prog_a0805f20ddc55e97_rusty_set_cpumask+0x98/0xc2

CPU states
----------

CPU 0   : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=420843 pnt_seq=226345
          curr=scx_rusty[105233] class=ext_sched_class

 *R scx_rusty[105233] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

    scx_dump_state+0x71a/0x8e0
    scx_ops_error_irq_workfn+0x40/0x50
    irq_work_run_list+0x50/0x90
    irq_work_run+0x18/0x50
    __sysvec_irq_work+0x1c/0xb0
    sysvec_irq_work+0x64/0x80
    asm_sysvec_irq_work+0x1a/0x20
    scx_ops_enable.isra.0+0xaf0/0xfd0
    bpf_struct_ops_link_create+0x12c/0x180
    __sys_bpf+0x1b5d/0x2bb0
    __x64_sys_bpf+0x25/0x30
    do_syscall_64+0x82/0x190
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

CPU 2   : nr_run=1 flags=0x1 cpu_rel=1 ops_qseq=384398 pnt_seq=190977
          curr=cc1[105208] class=ext_sched_class

 *R cc1[105208] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 3   : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=341845 pnt_seq=177225
          curr=cc1[104903] class=ext_sched_class

 *R cc1[104903] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 4   : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=395776 pnt_seq=227199
          curr=cc1[104874] class=ext_sched_class

 *R cc1[104874] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 5   : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=352954 pnt_seq=184117
          curr=cc1[105225] class=ext_sched_class

 *R cc1[105225] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 6   : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=380843 pnt_seq=210021
          curr=cc1[105168] class=ext_sched_class

 *R cc1[105168] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 8   : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=448343 pnt_seq=271350
          curr=cc1[105263] class=ext_sched_class

 *R cc1[105263] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 9   : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=435832 pnt_seq=250146
          curr=cc1[104883] class=ext_sched_class

 *R cc1[104883] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 10  : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=430707 pnt_seq=221375
          curr=cc1[105240] class=ext_sched_class

 *R cc1[105240] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 11  : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=367479 pnt_seq=215503
          curr=cc1[105188] class=ext_sched_class

 *R cc1[105188] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 12  : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=418299 pnt_seq=282694
          curr=cc1[104974] class=ext_sched_class

 *R cc1[104974] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 13  : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=459418 pnt_seq=275355
          curr=cc1[105050] class=ext_sched_class

 *R cc1[105050] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 14  : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=431442 pnt_seq=231494
          curr=cc1[105202] class=ext_sched_class

 *R cc1[105202] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

CPU 15  : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=374937 pnt_seq=206075
          curr=cc1[105229] class=ext_sched_class

 *R cc1[105229] +0ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a) dsq_vtime=0
      cpus=ffff

================================================================================

Error: EXIT: scx_bpf_error (Failed to lookup task 105197)
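
When triaging a dump like this one, it can help to check whether the task named in the error was current on any CPU at dump time. The following is a minimal sketch, not part of scx or its tooling: the helper names `failed_lookup_pid` and `current_tasks` are hypothetical, the `DUMP` string is an abbreviated excerpt of the dump above, and the regexes assume the `curr=comm[pid]` layout shown here with command names that contain no spaces or brackets.

```python
import re

# Abbreviated excerpt of the debug dump; paste the full text in practice.
DUMP = """\
scx_rusty[105233] triggered exit kind 1025:
  scx_bpf_error (Failed to lookup task 105197)

CPU 0   : nr_run=1 flags=0x1 cpu_rel=0 ops_qseq=420843 pnt_seq=226345
          curr=scx_rusty[105233] class=ext_sched_class

CPU 2   : nr_run=1 flags=0x1 cpu_rel=1 ops_qseq=384398 pnt_seq=190977
          curr=cc1[105208] class=ext_sched_class
"""


def failed_lookup_pid(dump):
    """Return the PID from a 'Failed to lookup task N' message, or None."""
    match = re.search(r"Failed to lookup task (\d+)", dump)
    return int(match.group(1)) if match else None


def current_tasks(dump):
    """Map PID -> command name for every 'curr=comm[pid]' entry in the dump."""
    # Assumption: comm has no whitespace or '[' (true for the entries shown).
    pairs = re.findall(r"curr=([^\[\s]+)\[(\d+)\]", dump)
    return {int(pid): comm for comm, pid in pairs}


pid = failed_lookup_pid(DUMP)
tasks = current_tasks(DUMP)
print(f"failed lookup: {pid}, current on a listed CPU: {pid in tasks}")
```

In the full dump above, task 105197 does not appear as `curr` on any listed CPU.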

I believe this is the same issue described in #610.