using rtkit with pipewire and realtime audio applications (fedora 34+)
fernandoll opened this issue · 1 comments
Pipewire is now the default audio system for Fedora 34+, and it replaces both PulseAudio and Jack2. By default it uses rtkit to switch threads to realtime scheduling when a client requests it. My workstations (both at work and at home) are used for "professional" audio work using various Jack clients (Ardour, SuperCollider, etc.), and that requires realtime scheduling for reliable operation. It is impossible to use rtkit "out of the box" in this context, for several reasons:
- realtime priorities are hardwired and the maximum is very low (20, which is below 50, the default priority of interrupt threads when "threadirqs" is used)
- maximum amount of realtime cpu usage is hardwired to a very low value (20% of total usage). This limitation is really bad as I want to be able to use a substantial amount of CPU for realtime DSP load (that is the purpose of the workstation!)
- rtkit does not even have the concept of SCHED_FIFO scheduling in it; it just forces threads to use SCHED_RR regardless of the requirements of the software that requests realtime access (I have been using SCHED_FIFO for jackd audio threads for a very long time without issues)
- I am still investigating this one, but the inclusion of SCHED_RESET_ON_FORK interferes with some software's acquisition of realtime privileges, in particular SuperCollider's supernova synthesis engine, which uses several threads to distribute DSP load among CPU cores. The first DSP thread acquires the right priority; all others fail, presumably because they fork from the first one.
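To check what a given thread actually ended up with (policy, priority, and whether the reset-on-fork flag is set), something like the following could be called from inside each DSP thread. This is only a minimal sketch, assuming Linux with glibc; the helper names `base_policy`, `has_reset_on_fork` and `describe_calling_thread` are made up for illustration and are not part of rtkit, Pipewire or SuperCollider.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc exposes SCHED_RESET_ON_FORK only with this */
#endif
#include <sched.h>
#include <stdio.h>

/* Fallback in case <sched.h> was already included without _GNU_SOURCE;
   0x40000000 is the Linux kernel ABI value of the flag. */
#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* Strip the reset-on-fork flag, leaving SCHED_OTHER, SCHED_FIFO, SCHED_RR, ... */
int base_policy(int policy)
{
    return policy & ~SCHED_RESET_ON_FORK;
}

/* Non-zero if the policy value carries the reset-on-fork flag. */
int has_reset_on_fork(int policy)
{
    return (policy & SCHED_RESET_ON_FORK) != 0;
}

/* Print policy, priority and reset-on-fork state of the calling thread.
   Returns 0 on success, -1 if the kernel query fails. */
int describe_calling_thread(void)
{
    struct sched_param param;
    int policy = sched_getscheduler(0);    /* 0 means "the calling thread" */
    const char *name;

    if (policy < 0 || sched_getparam(0, &param) < 0) {
        perror("sched query");
        return -1;
    }

    switch (base_policy(policy)) {
    case SCHED_OTHER: name = "SCHED_OTHER"; break;
    case SCHED_FIFO:  name = "SCHED_FIFO";  break;
    case SCHED_RR:    name = "SCHED_RR";    break;
    default:          name = "other";       break;
    }

    printf("policy=%s priority=%d reset_on_fork=%s\n",
           name, param.sched_priority,
           has_reset_on_fork(policy) ? "yes" : "no");
    printf("SCHED_FIFO priority range: %d..%d\n",
           sched_get_priority_min(SCHED_FIFO),
           sched_get_priority_max(SCHED_FIFO));
    return 0;
}
```

Calling `describe_calling_thread()` in the first DSP thread and in one of the later ones should show whether the later threads really stay at a non-realtime policy, and whether the flag is present on the thread that did get realtime scheduling.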
I do understand the purpose of rtkit, but right now it is not usable because all those limits are hardwired and cannot be changed by the administrator of the system. I cannot choose to not use reset on fork, or up the max realtime priority or cpu realtime usage. There has to be a mechanism by which the administrator of a workstation can change the defaults to something that will enable a particular workload.
Removing rtkit to get around these issues is not an option: in Fedora 34, for example, pipewire has a hard dependency on rtkit and cannot be installed without it.
My only temporary workaround is to create patched rpm packages of rtkit where those limits are changed to defaults that are suitable for professional audio work, but this is not a good long-term solution.
* realtime priorities are hardwired and the maximum is very low (20, which is below 50, the default priority of interrupt threads when "threadirqs" is used)
Well, rtkit-daemon can be configured by adding command line options to rtkit-daemon.service, for instance:
ExecStart=/usr/libexec/rtkit-daemon --min-nice-level=-17 --threads-per-user-max=11
* I am still investigating this one, but the inclusion of SCHED_RESET_ON_FORK interferes with some software's acquisition of realtime privileges, in particular SuperCollider's supernova synthesis engine, which uses several threads to distribute DSP load among CPU cores. The first DSP thread acquires the right priority; all others fail, presumably because they fork from the first one.
I'm also running into this with Anklang: the synthesis engine needs to run multiple threads (one per CPU core), and due to SCHED_RESET_ON_FORK, each one has to acquire a high-priority nice level on its own. The default rtkit limit for threads per user is 25, which means that on systems with 32 cores, not all threads can acquire high priorities (possibly leading to priority inversion). On systems with 16 cores, starting Anklang twice within 20 seconds will trigger the rtkit burst limit, and rtkit will refuse to renice all threads of the second run, even though all threads from the first run are gone.
Now, to a certain extent, this is intended, e.g. to prevent rtkit misuse via fork bombs.
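To make the burst arithmetic concrete, here is a simplified model of a per-user burst limiter. It is not rtkit's actual code: the 20 second window comes from the description above, while the budget of 25 requests per window and all names (`burst_window`, `burst_allow`, `refused_in_second_run`) are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-user state: start of the current window and the
   number of requests granted inside it. */
struct burst_window {
    long start_sec;
    unsigned count;
};

/* Grant (1) or refuse (0) a single request made at time now_sec. */
int burst_allow(struct burst_window *w, long now_sec,
                long window_sec, unsigned max_actions)
{
    assert(w != NULL && window_sec > 0);

    /* Open a fresh window on first use or once the old one has expired. */
    if (w->count == 0 || now_sec - w->start_sec >= window_sec) {
        w->start_sec = now_sec;
        w->count = 0;
    }
    if (w->count >= max_actions)
        return 0;              /* budget for this window is used up */
    w->count++;
    return 1;
}

/* Start the same program twice, gap_sec apart, with one request per
   thread, and count how many requests of the second run are refused. */
unsigned refused_in_second_run(unsigned threads_per_run, long gap_sec,
                               long window_sec, unsigned max_actions)
{
    struct burst_window w = { 0, 0 };
    unsigned refused = 0;
    unsigned i;

    for (i = 0; i < threads_per_run; i++)
        (void)burst_allow(&w, 0, window_sec, max_actions);       /* first run */
    for (i = 0; i < threads_per_run; i++)
        if (!burst_allow(&w, gap_sec, window_sec, max_actions))  /* second run */
            refused++;
    return refused;
}
```

Under these assumed numbers, `refused_in_second_run(16, 10, 20, 25)` returns 7: the first run uses 16 of the 25 slots, so only 9 of the second run's 16 threads get through, regardless of whether the first run's threads still exist. Waiting longer than the window, e.g. `refused_in_second_run(16, 25, 20, 25)`, returns 0.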
So, given that there are good reasons to keep SCHED_RESET_ON_FORK, rtkit's behaviour should somehow adapt to the increasing number of cores in modern CPUs; after all, the current defaults are more than 12 years old:
7704290 (@poettering) 2009-06-04 20:32:55 +0200 116) static unsigned threads_per_user_max = 25;
Here is an example that could scale with the cores present (the other limits would probably also need slight adjustments):
unsigned threads_per_user_max = 5 + 5 * sysconf (_SC_NPROCESSORS_ONLN);
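A slightly fuller sketch of that idea, to show how the numbers come out. The formula is the one above; the helper names (`scaled_threads_per_user_max`, `threads_left_out`, `print_limits`) are made up for this example and are not rtkit code. Note that `sysconf()` can return -1 on failure, so the sketch clamps the core count to at least 1 before using it.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Fixed default from the git blame line quoted above. */
#define FIXED_THREADS_PER_USER_MAX 25u

/* Proposed scaling: 5 + 5 * number of online cores. */
unsigned scaled_threads_per_user_max(long ncpus)
{
    assert(ncpus > 0);
    return 5u + 5u * (unsigned)ncpus;
}

/* How many of `wanted` threads do not fit under `limit`. */
unsigned threads_left_out(unsigned wanted, unsigned limit)
{
    return wanted > limit ? wanted - limit : 0u;
}

/* Compare the fixed and the scaled limit for an engine that runs
   one realtime thread per online core. */
void print_limits(void)
{
    long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
    unsigned scaled;

    if (ncpus < 1)
        ncpus = 1;             /* sysconf() returns -1 on error */
    scaled = scaled_threads_per_user_max(ncpus);

    printf("online cores: %ld\n", ncpus);
    printf("fixed limit:  %u (%u per-core threads left out)\n",
           FIXED_THREADS_PER_USER_MAX,
           threads_left_out((unsigned)ncpus, FIXED_THREADS_PER_USER_MAX));
    printf("scaled limit: %u (%u per-core threads left out)\n",
           scaled, threads_left_out((unsigned)ncpus, scaled));
}
```

With this formula a 4-core machine gets 5 + 5 * 4 = 25, i.e. exactly the current default, while a 32-core machine gets 165 instead of a limit that leaves 7 of 32 per-core threads without elevated priority.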