Option to set IRQ affinity
Closed this issue · 7 comments
Is your feature request related to a problem? Please describe.
I have been watching my system's battery life for some time. Adding nohz_full and rcu_nocbs changed things a little, but what really improved battery life was moving IRQs to E-cores.
I have a laptop with a Core Ultra 155H, which has two low-power efficient (LPE) cores in addition to the usual P- and E-cores found on recent Intel CPUs. These two are CPUs 20 and 21, so I ran
for f in /proc/irq/*/smp_affinity_list; do echo 20-21 > $f; done
to move as many IRQs as possible to the LPE cores. As a result, battery life seemingly improved by 20+%. I can't back that up with statistics yet, but what I saw over several hours was pretty exciting.
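The one-liner above can be wrapped in a small function that also counts how many IRQs accepted or refused the new affinity. This is only a sketch: the function name and the optional irq_root argument (which lets you try it on a scratch directory instead of /proc/irq) are made up for illustration.

```shell
# Sketch: write one CPU list to every IRQ's smp_affinity_list.
# move_irqs and irq_root are hypothetical names, not part of TLP.
move_irqs() {
    cpulist="$1"
    irq_root="${2:-/proc/irq}"
    moved=0
    refused=0
    for f in "$irq_root"/*/smp_affinity_list; do
        [ -e "$f" ] || continue
        # Some IRQs reject the write; count them instead of aborting.
        if (echo "$cpulist" > "$f") 2>/dev/null; then
            moved=$((moved + 1))
        else
            refused=$((refused + 1))
        fi
    done
    echo "moved=$moved refused=$refused"
}
```

On a real system it would be called as root, for example as move_irqs 20-21.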
Describe the solution you'd like
My use case: moving interrupts to energy-efficient cores on battery mode.
In its simplest form I see this as one new config variable, IRQ_AFFINITY_LIST_BAT, which in my case would be 20-21. When tlp detects battery mode, it moves all IRQs to these cores if the variable is defined. Then, when AC is plugged in, the default affinity is restored (need to think about how to do this; can you just write 0 there?).
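On the restore question: smp_affinity_list takes a CPU list, so writing 0 there would pin the IRQ to CPU 0 rather than reset it. One possible approach, sketched below, is to copy the kernel's default mask from /proc/irq/default_smp_affinity into each IRQ's smp_affinity file. The function name and the irq_root argument are hypothetical. This does not bring back per-IRQ affinities that a driver or irqbalance had set earlier.

```shell
# Sketch: reset every IRQ to the kernel's default affinity mask.
# restore_irqs and irq_root are hypothetical names, not part of TLP.
restore_irqs() {
    irq_root="${1:-/proc/irq}"
    [ -r "$irq_root/default_smp_affinity" ] || return 1
    # default_smp_affinity holds a hex CPU mask, the same format
    # that the per-IRQ smp_affinity files accept.
    mask=$(cat "$irq_root/default_smp_affinity")
    for f in "$irq_root"/*/smp_affinity; do
        [ -e "$f" ] || continue
        # Ignore IRQs that refuse the write.
        (echo "$mask" > "$f") 2>/dev/null || :
    done
}
```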
Should IRQ_AFFINITY_LIST_AC also be available? Personally I don't have a use case for it if there is a simple way to automatically move IRQs back to all CPUs.
Is some kind of IRQ blacklist needed? My system works fine after trying to move all interrupts. Some of them simply declined to move, so I guess a blacklist is not needed.
Describe alternatives you've considered
The only automatic alternative that I thought about is moving IRQs at boot with a small systemd oneshot service. It will not be able to move IRQs back on AC though.
Hi,
first of all, I am of the opinion that the Intel developers should provide reasonable, energy-saving defaults in the kernel. But of course that doesn't help in the short term.
Since I don't own any hardware with P- and E-Cores, I couldn't test it myself, which makes the development quite tedious. However, I would be open to a well-tested pull request from your side.
I think it would be best if the code could determine the E cores automatically so that the user doesn't have to do it themselves. Then you would only need a distinction Y/N.
Incidentally, I do consider both an _ON_BAT and an _ON_AC parameter to be necessary. There will always be users who want the higher performance for AC.
PS: We also need a way for tlp-stat -p to display the status in a concise form.
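One way a concise status could look is a count of IRQs per distinct affinity list. The sketch below is only an illustration: the function name, the irq_root argument and the output format are invented here and are not taken from tlp-stat.

```shell
# Sketch: summarise how many IRQs share each affinity list.
# irq_affinity_summary and its output format are hypothetical.
irq_affinity_summary() {
    irq_root="${1:-/proc/irq}"
    for f in "$irq_root"/*/smp_affinity_list; do
        [ -r "$f" ] && cat "$f"
    done | sort | uniq -c | while read -r count list; do
        printf '%s: %s IRQ(s)\n' "$list" "$count"
    done
}
```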
> There will always be users who want the higher performance for AC.
The default affinity "all" pretty much does that, but if we add an IRQ affinity setting for battery, it makes sense to add one for AC as well, for people like audio engineers or gamers who may want to move interrupts away from some isolated cores.
> I think it would be best if the code could determine the E cores automatically so that the user doesn't have to do it themselves.
True. How do you see it: a few special values in addition to the normal affinity lists as they are described in the kernel docs? Like "all", "p-cores", "e-cores" and "lpe-cores", which tlp would replace with "0-21", "0-11", "12-19" and "20-21" respectively on my machine.
"5,14-15,lpe-cores", for example, will then be passed as "5,14-15,20-21" — what do you think?
Hope I won't need to delve into Intel Thread Director for this one, it's not even in the mainline kernel yet.
Also, just for the lols, my 155H cpu actually has more like 4 classes of cores: P-cores 0-1 (0-3 logical) have higher turbo frequency than the rest. Guess we will understand what to do with this fact along the way.
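The alias idea from this comment could be sketched as a small translation step. The mapping below is hard-coded to the 155H ranges quoted above purely for illustration; a real implementation would have to detect them. The function name is hypothetical.

```shell
# Sketch: replace symbolic aliases in a comma-separated CPU list.
# The ranges are the ones quoted for the 155H in this thread and are
# hard-coded as an assumption; expand_cpu_aliases is a made-up name.
# The body runs in a subshell so the IFS change stays local.
expand_cpu_aliases() (
    out=""
    IFS=,
    for item in $1; do
        case "$item" in
            all)       item="0-21" ;;
            p-cores)   item="0-11" ;;
            e-cores)   item="12-19" ;;
            lpe-cores) item="20-21" ;;
        esac
        out="${out:+$out,}$item"
    done
    echo "$out"
)
```

With this table, expand_cpu_aliases "5,14-15,lpe-cores" yields 5,14-15,20-21, and plain lists pass through unchanged.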
Started to implement it here https://github.com/linrunner/TLP/compare/main...vient:TLP:add-irq-affinity-option?expand=1
Currently it supports basic affinity lists (without any preprocessing, so no CPU classes yet) and an include/exclude list.
It seems to work fine on my machine (I've set IRQ_AFFINITY_LIST_ON_BAT=20-21 and IRQ_AFFINITY_EXCLUDE="204 225").
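For readers who have not opened the branch, here is a rough sketch of how an exclude list like the one above might be applied. It is not the code from the linked branch. The function name and the irq_root argument are hypothetical, and the exclude list is assumed to be space-separated IRQ numbers.

```shell
# Sketch: set the affinity of every IRQ except the excluded ones.
# apply_irq_affinity and irq_root are hypothetical names.
apply_irq_affinity() {
    cpulist="$1"
    exclude="$2"                 # e.g. "204 225"
    irq_root="${3:-/proc/irq}"
    for f in "$irq_root"/*/smp_affinity_list; do
        [ -e "$f" ] || continue
        irq=$(basename "$(dirname "$f")")
        # Pad both sides with spaces so "20" does not match "204".
        case " $exclude " in
            *" $irq "*) continue ;;
        esac
        (echo "$cpulist" > "$f") 2>/dev/null || :
    done
}
```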
Would appreciate if you take a quick look and say that I'm not doing something terribly wrong there 🙂
Now, supporting aliases like "low-power" is a completely different piece of work. My current plan is to find out where lscpu -e gets its MAXMHZ info and split the cores into tiers based on that. For example, my system shows
$ lscpu -e
CPU  NODE  SOCKET  CORE  ONLINE  MAXMHZ     MINMHZ    MHZ
0    0     0       0     yes     4800.0000  400.0000  1989.3700
1    0     0       0     yes     4800.0000  400.0000  400.0000
2    0     0       1     yes     4800.0000  400.0000  400.0000
3    0     0       1     yes     4800.0000  400.0000  400.0000
4    0     0       2     yes     4600.0000  400.0000  400.0000
5    0     0       2     yes     4600.0000  400.0000  400.0000
6    0     0       3     yes     4600.0000  400.0000  400.0000
7    0     0       3     yes     4600.0000  400.0000  400.0000
8    0     0       4     yes     4600.0000  400.0000  400.0000
9    0     0       4     yes     4600.0000  400.0000  400.0000
10   0     0       5     yes     4600.0000  400.0000  400.0000
11   0     0       5     yes     4600.0000  400.0000  400.0000
12   0     0       6     yes     3800.0000  400.0000  400.0000
13   0     0       7     yes     3800.0000  400.0000  400.0000
14   0     0       8     yes     3800.0000  400.0000  400.0000
15   0     0       9     yes     3800.0000  400.0000  400.0000
16   0     0       10    yes     3800.0000  400.0000  400.0000
17   0     0       11    yes     3800.0000  400.0000  400.0000
18   0     0       12    yes     3800.0000  400.0000  400.0000
19   0     0       13    yes     3800.0000  400.0000  400.0000
20   0     0       14    yes     2500.0000  400.0000  400.0000
21   0     0       15    yes     2500.0000  400.0000  400.0000
which should result in 4 tiers: tier0=0-3 tier1=4-11 tier2=12-19 tier3=20-21
Also tierMin is needed, I think, as an alias to tier3 in my case
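The tier split could be sketched with sort and awk alone. The function below reads "cpu max_freq" pairs on stdin, groups CPUs by maximum frequency (highest first) and prints each group as a compressed range. The function name and the input format are assumptions made for this example. The last tier printed is the one that tierMin would alias.

```shell
# Sketch: group CPUs into tiers by maximum frequency.
# Input on stdin: one "cpu max_freq" pair per line (assumed format).
# cpu_tiers is a hypothetical name.
cpu_tiers() {
    sort -k2,2nr -k1,1n | awk '
        # Close the currently open CPU range and append it to list.
        function flush() {
            if (start == "") return
            range = (start == prev) ? start : start "-" prev
            list = (list == "") ? range : list "," range
            start = ""
        }
        {
            if ($2 != freq) {            # new frequency: new tier
                flush()
                if (NR > 1) { printf "tier%d=%s ", tier++, list; list = "" }
                freq = $2
            } else if ($1 != prev + 1) { # gap inside a tier
                flush()
            }
            if (start == "") start = $1
            prev = $1
        }
        END {
            flush()
            if (NR > 0) printf "tier%d=%s\n", tier, list
        }'
}

# Possible input source, assuming the cpufreq sysfs files are present:
#   for c in /sys/devices/system/cpu/cpu[0-9]*; do
#       [ -r "$c/cpufreq/cpuinfo_max_freq" ] &&
#           echo "${c##*cpu} $(cat "$c/cpufreq/cpuinfo_max_freq")"
#   done | cpu_tiers
```

Fed with the CPU and MAXMHZ columns from the lscpu output above, this prints tier0=0-3 tier1=4-11 tier2=12-19 tier3=20-21.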
I have not found a proper way to identify P, E and LPE cores yet; it may be possible by reading some MSRs.
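A possible alternative to MSRs: to my knowledge, kernels with hybrid PMU support expose the CPU lists of the two core types in sysfs as /sys/devices/cpu_core/cpus and /sys/devices/cpu_atom/cpus. The sketch below just reads those files. Whether they exist on a given kernel, and whether LPE cores are listed under cpu_atom, are assumptions to verify on real hardware. The function name and the sys_root argument are hypothetical.

```shell
# Sketch: print the CPU lists of the hybrid core types, if exposed.
# Assumes the cpu_core/cpu_atom PMU directories exist in sysfs;
# hybrid_cpu_lists and sys_root are hypothetical names.
hybrid_cpu_lists() {
    sys_root="${1:-/sys/devices}"
    for kind in core atom; do
        f="$sys_root/cpu_$kind/cpus"
        [ -r "$f" ] && echo "$kind=$(cat "$f")"
    done
    return 0
}
```

This does not separate LPE cores from ordinary E-cores, so it would still need to be combined with something like the frequency tiers.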
> True. How do you see it: a few special values in addition to the normal affinity lists as they are described in the kernel docs? Like "all", "p-cores", "e-cores" and "lpe-cores", which tlp would replace with "0-21", "0-11", "12-19" and "20-21" respectively on my machine.
Fine if your code does the translation.
> Did not see a proper way to find P, E, LPE cores yet, may be possible by reading some MSRs
No additional external tools that are not already present in every Linux installation. lscpu from util-linux is fine, I guess. Apart from that, just what a shell script can do.
Please add your code to 10-tlp-func-cpu. I suggest you create a branch apart from main in your fork.
For reference, this is how Intel engineers differentiate P- and E-cores in OpenVINO: https://github.com/openvinotoolkit/openvino/blob/releases/2024/0/src/inference/src/os/lin/lin_system_conf.cpp#L693
No, lost interest