hamadmarri/cacule-cpu-scheduler

RDB schedutil support

raykzhao opened this issue · 31 comments

Hi @hamadmarri

It would be nice to have the latest RDB support the schedutil CPU frequency governor, which would be helpful on battery-powered devices. Since the autogroup-testing RDB patch on 5.12 can already support schedutil, I am wondering if it could be possible to port necessary changes to 5.13. To avoid those re-enabled statistics messing up the performance when schedutil is disabled in Kconfig, maybe we can add something like #if defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) on those statistics.
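The Kconfig guard suggested above could look roughly like the sketch below. It is a user-space illustration, not code from the RDB patch: `struct rq_stats` and `rdb_update_util_stats()` are hypothetical names, and the arithmetic is a placeholder. Only the `CONFIG_CPU_FREQ_GOV_SCHEDUTIL` symbol comes from the kernel's Kconfig. When the symbol is not defined, the helper collapses to an empty inline function, so callers need no `#if` of their own.

```c
/* Hypothetical per-runqueue statistics that only schedutil would consume. */
struct rq_stats {
	unsigned long util_avg;
	unsigned long nr_updates;
};

#if defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
/* schedutil is built in: keep the utilization bookkeeping. */
static inline void rdb_update_util_stats(struct rq_stats *st, unsigned long delta)
{
	st->util_avg += delta;
	st->nr_updates++;
}
#else
/* schedutil is disabled: the call compiles away to nothing. */
static inline void rdb_update_util_stats(struct rq_stats *st, unsigned long delta)
{
	(void)st;
	(void)delta;
}
#endif
```

Built without `-DCONFIG_CPU_FREQ_GOV_SCHEDUTIL`, a call to `rdb_update_util_stats()` leaves the structure untouched; built with it, the counters advance.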

Currently I am using the latest autogroup-testing RDB patch synced to 5.13 and schedutil works fine on my laptop.
rdb-autogroup-testing_5.13.zip

@raykzhao

Do you get the same behavior on CFS 5.13?

Hi @raykzhao

I am revisiting the autogroup patch/branch again. I noticed that you have some modifications, such as removing some update_blocked stats functions (__update_blocked_others and others). Did you add these changes yourself, or did they come from an old patch I posted?

Thank you

Hi @hamadmarri

Those changes were synced from your RDB 5.13 patch. After I applied the RDB autogroup-testing patch on top of CacULE 5.13, it showed those functions were unused when compiling the kernel.

Hi @raykzhao

Could you please test this patch? I have only added calls to

cfs_rq_util_change
cpufreq_update_util

in almost all the places where CFS uses them.
I don't know how to reproduce your CPU frequency issue on my machine.
Please let me know if it is fixed; otherwise I will add all the update stats to RDB.

Thank you

rdb-cpufreq.zip

Hi @hamadmarri,

Unfortunately with this fix my CPU is still locked at the highest frequency with schedutil.

By the way, I am not using intel-pstate because it has not worked on my laptop since 5.12. In order to test schedutil, maybe you can try either compiling the kernel with CONFIG_X86_INTEL_PSTATE=n, or booting with the intel_pstate=passive kernel parameter, so that the generic cpufreq governors (including schedutil) can be used on Intel CPUs.

Hey,

A month ago I had an i7-8700K running a kernel with the pstate patches and the boot option intel_pstate=passive or hwp (I don't remember exactly which). I didn't face any problems, even when overclocked or with the RDB kernel.

Hi @ptr1337,

Could you please provide a link to the pstate patches you are talking about? I'm happy to see if it could fix the pstate problem on my laptop.

Hi @hamadmarri,

I have just tested the latest cacule-5.13-rdb-autogroup-testing branch, and schedutil works fine just as the old RDB autogroup testing patch on 5.12. In addition, the autogroup with this branch feels much better i.e. more responsive under heavy load compared to the previous 5.12 RDB testing patch with autogroup enabled.

However, I got the following kernel warning in dmesg:

[    0.831884] ------------[ cut here ]------------
[    0.831885] !entity_is_task(se)
[    0.831887] WARNING: CPU: 2 PID: 314 at kernel/sched/fair.c:301 __enqueue_entity+0x299/0x2b0
[    0.831893] Modules linked in: dm_mirror dm_region_hash dm_log
[    0.831896] CPU: 2 PID: 314 Comm: systemd-udevd Not tainted 5.13.3 #1
[    0.831899] Hardware name: Acer Nitro AN515-51/Freed_KLS, BIOS V1.13 12/26/2017
[    0.831900] RIP: 0010:__enqueue_entity+0x299/0x2b0
[    0.831903] Code: 6c 24 30 48 89 5d 00 49 89 5e 08 eb d3 48 89 e8 48 89 cd eb bb c6 05 14 6d 36 01 01 48 c7 c7 b2 a2 15 9c 31 c0 e8 97 2c fc ff <0f> 0b e9 31 fe ff ff 48 89 5d 40 eb a5 00 00 cc cc 00 00 cc cc 00
[    0.831905] RSP: 0018:ffff9c66804ffcc8 EFLAGS: 00010046
[    0.831907] RAX: 427b339304f9b000 RBX: ffff8db642d55028 RCX: 427b339304f9b000
[    0.831908] RDX: 00000000ffffefff RSI: 0000000000000002 RDI: 0000000000000004
[    0.831910] RBP: ffff8db8fed22b80 R08: 0000000000000000 R09: ffffffff9c453c20
[    0.831911] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8db642d55000
[    0.831912] R13: 0000000000000000 R14: ffff8db8fed22b80 R15: 0000000031f7926f
[    0.831913] FS:  00007fccc49b0740(0000) GS:ffff8db8fed00000(0000) knlGS:0000000000000000
[    0.831915] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.831916] CR2: 000055946bf07288 CR3: 000000027f42a001 CR4: 00000000003706e0
[    0.831918] Call Trace:
[    0.831920]  enqueue_entity+0x140/0x2c0
[    0.831922]  enqueue_task_fair.llvm.17820080379486730809+0xe1/0x720
[    0.831925]  ? propagate_entity_cfs_rq+0x250/0x430
[    0.831928]  sched_move_task+0x14f/0x2e0
[    0.831931]  autogroup_move_group+0xa5/0x150
[    0.831933]  sched_autogroup_create_attach+0xf0/0x190
[    0.831936]  ksys_setsid+0xde/0xf0
[    0.831938]  __x64_sys_setsid+0x5/0x10
[    0.831940]  do_syscall_64+0x64/0x80
[    0.831944]  ? do_user_addr_fault+0x24d/0x5f0
[    0.831947]  ? asm_exc_page_fault+0x8/0x30
[    0.831949]  ? vtime_user_enter+0xf/0xb0
[    0.831951]  ? __context_tracking_enter+0x48/0x60
[    0.831953]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    0.831956] RIP: 0033:0x7fccc4bdf5e7
[    0.831957] Code: 73 01 c3 48 8b 0d 81 c8 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 70 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 c8 0e 00 f7 d8 64 89 01 48
[    0.831959] RSP: 002b:00007ffd3525f1b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000070
[    0.831961] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fccc4bdf5e7
[    0.831962] RDX: 00007fccc49b0a00 RSI: 00007fccc4d77020 RDI: 0000000000000001
[    0.831963] RBP: 0000000000000000 R08: 0000000000000000 R09: 000055946c240720
[    0.831964] R10: 00007fccc4b29128 R11: 0000000000000206 R12: 000055946beeae30
[    0.831965] R13: 000055946beea17d R14: 0000000000000000 R15: 000055946c2272a0
[    0.831967] ---[ end trace 24fb4358abd519f3 ]---

It is because of the recently added cn_has_idle_policy:

static inline int cn_has_idle_policy(struct cacule_node *se)
{
	return task_has_idle_policy(task_of(se_of(se)));
}

I need to check if se is a task first.
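A minimal user-space sketch of that check follows. The struct layouts are simplified stand-ins for the kernel's (assumptions made for illustration), and the guarded function is only one plausible shape for the fix, not the change that was actually committed. The idea is that a group entity (one that owns a runqueue, so `my_q` is non-NULL) is not embedded in a `task_struct`, which means `task_of()` must not be called on it.

```c
#include <stddef.h>

#ifndef SCHED_IDLE
#define SCHED_IDLE 5	/* same value as the kernel's SCHED_IDLE policy */
#endif

/* Simplified stand-ins for the kernel structures (illustration only). */
struct cacule_node { int unused; };

struct sched_entity {
	struct cacule_node cacule_node;
	void *my_q;		/* non-NULL for group entities */
};

struct task_struct {
	int policy;
	struct sched_entity se;
};

#ifndef container_of
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#endif

static inline struct sched_entity *se_of(struct cacule_node *cn)
{
	return container_of(cn, struct sched_entity, cacule_node);
}

/* A task entity owns no runqueue; a group entity does. */
static inline int entity_is_task(struct sched_entity *se)
{
	return se->my_q == NULL;
}

static inline struct task_struct *task_of(struct sched_entity *se)
{
	return container_of(se, struct task_struct, se);
}

static inline int task_has_idle_policy(struct task_struct *p)
{
	return p->policy == SCHED_IDLE;
}

/* Guarded variant: group entities are never treated as SCHED_IDLE tasks. */
static inline int cn_has_idle_policy(struct cacule_node *cn)
{
	struct sched_entity *se = se_of(cn);

	if (!entity_is_task(se))
		return 0;

	return task_has_idle_policy(task_of(se));
}
```

With the guard in place, the autogroup path in the trace above (sched_move_task enqueueing a group entity) would return early instead of reaching task_of().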

I think this error must also be present on CacULE without RDB but with autogroup.

I am fixing it
Thank you

@raykzhao

Please check the fix hamadmarri/linux@3933f70

Thank you

@raykzhao

Did you test the pstate patch? Did it help?

@ptr1337

I am compiling RDB+autogroup. I hope this also fixes the bad performance I had with it last time.

Hi @ptr1337,

Unfortunately it seems that intel-pstate is buggy on my laptop. All CPU cores except CPU0 are locked at their lowest frequencies under load, although CPU0 is at the turbo frequency.

I suspect this might be caused by full dynticks. Someone else observed similar behaviour in #32 (reply in thread). However, it seems that in their case, the CPU cores with full dynticks enabled can scale their frequencies.

Hi @raykzhao

Would you suggest making RDB with the recent autogroup the default?
I noticed that RDB+autogroup is more responsive than plain RDB, but
I am worried about losing the overall performance that plain RDB provides if autogroup is added to it.

On my machine, the tests I ran gave conflicting results, which is confusing:

plain RDB: wins on stress-ng (by a noticeable margin)
RDB+autogroup: wins on make -j5 compiling the kernel (also by a noticeable margin!)

Monitoring in htop, I noticed that plain RDB distributes tasks better, but this is not a reliable indicator,
since htop readings may not be very precise, and since autogroup has already grouped tasks in such a way that some tasks don't have IS values sufficient to be migrated to the (seemingly relaxed) CPU. I am still not quite confident in RDB+autogroup and need your feedback.

Thank you

I'll just need a final patch to really test it. Testing takes time.

Hi @hamadmarri,

I think it is probably a better idea to have a patch file ready and let more people test it, as @ptr1337 suggested. I feel that at the current stage more testing under different loads (e.g. gaming, multimedia) is still necessary before making the decision. You could open a new discussion thread to advertise it and collect feedback once the patch file is ready.

I am working on it. I am just testing a small change I made in the try_push_any function. I will upload a full cacule+rdb+autogroup patch, and also another patch that has only rdb+autogroup.

@raykzhao

Did you try disabling pstate + cstate in your BIOS, and check whether the problem still happens?

Hi @raykzhao

Could you please try this patch on top of the recent RDB?

rdb-cpufreq-gov-schedutil.zip

Please let me know if the CPU frequency issue is fixed with schedutil.

Thanks

Hi @hamadmarri,

Unfortunately all CPU cores are still locked at the highest frequency when using schedutil with this fix.

Since mostly Intel users have this problem, please go through this documentation:

https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html

Also check your temperatures and which governor is really in use.
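One way to check which governor is actually active is to read the cpufreq sysfs attributes directly. The sketch below assumes the standard `scaling_governor` and `scaling_cur_freq` files under `/sys/devices/system/cpu/cpuN/cpufreq/`; the helper names (`cpufreq_attr_path`, `read_first_line`, `print_cpufreq_state`) are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Build the sysfs path of one cpufreq attribute for one CPU. */
static int cpufreq_attr_path(char *buf, size_t len, int cpu, const char *attr)
{
	return snprintf(buf, len,
			"/sys/devices/system/cpu/cpu%d/cpufreq/%s", cpu, attr);
}

/* Read the first line of a file into out, without the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
static int read_first_line(const char *path, char *out, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}

/* Print the governor and current frequency (kHz) for CPUs 0..ncpus-1. */
static void print_cpufreq_state(int ncpus)
{
	char path[128], gov[64], freq[64];

	for (int cpu = 0; cpu < ncpus; cpu++) {
		cpufreq_attr_path(path, sizeof(path), cpu, "scaling_governor");
		if (read_first_line(path, gov, sizeof(gov)))
			strcpy(gov, "unknown");

		cpufreq_attr_path(path, sizeof(path), cpu, "scaling_cur_freq");
		if (read_first_line(path, freq, sizeof(freq)))
			strcpy(freq, "unknown");

		printf("cpu%d: governor=%s cur_freq_khz=%s\n", cpu, gov, freq);
	}
}
```

Calling `print_cpufreq_state()` with the number of CPUs, once at idle and once under load, should show whether every core reports the expected governor and whether the frequencies actually move.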

I didn't face this problem on an i7-8700K while it was overclocked with all cores set to 5.x GHz.

With the AVX value set to -3, frequency scaling worked without problems, depending on the power that was needed.

Also, my CPU didn't go over 60 °C.

@hamadmarri, I can confirm that my CPU (AMD Ryzen 7 3700X) is also stuck at a high frequency with schedutil: 3.6 GHz (the base clock, I think, not boosted). With ondemand, idle cores throttle down to 2.2 GHz. I'm not using nohz_full now.

Yep, I now own a Ryzen 5900X, and it depends on the governor I set, but it sticks at 3.6 GHz (which is the normal base clock and normal behaviour). If you change to powersave or ondemand, it also clocks down to 1.8-2.2 GHz.

It's just a problem with the pstate driver and Intel Turbo Boost/Speed Shift.

@raykzhao

Did you change any settings related to that behaviour in your BIOS?

Sorry, I forgot a very important part in the last patch, which was update_cfs_rq_load_avg.

Please test this patch instead.

rdb-cpufreq-gov-schedutil-2.zip

I am so sorry that I can't confirm a solution myself: on my machine even the current RDB shows normal CPU frequencies when switching to schedutil. I also tried switching back and forth between different governors under different loads, and everything worked well. However, given that rdb+autogroup was good, it must be a matter of updating loads. This patch will hopefully work for you guys, since it is almost the same as the autogroup one. If it doesn't work, then I would ask you to please try cacule without rdb and without fair_group. Maybe the real reason is disabling fair_group itself.

Thank you for testing

Hi @hamadmarri,

schedutil should be working with the latest fix on my laptop now. Thanks!

@raykzhao @hamadmarri

How about the normal governors? I see only additions for schedutil in the patch.
@raykzhao do you have the same issue with another governor, or only with schedutil?

Regards.

Hi @raykzhao

Good to hear,

I will patch RDB with the latest fix 👍

Thank you

Hi @ptr1337

The issue was only happening when using schedutil with RDB.

@hamadmarri, I have tested cacule+rdb after the latest commit. Efficiency seems good (almost 100% cpu usage on kernel compile) and schedutil freq governor (on AMD) seems to be working correctly now. I will try to run this for a while (games, video, background load) and report later. Good work and thank you.