karlstav/cava

Pipewire linking issues and a solution

Closed this issue · 11 comments

First off, apologies for not following the bug template, but I'm sure you will see that I'm not taking shortcuts ;)
I seem to have found the cause and a fix for this issue, and several others which had not been reported yet, but it needs some special handling to do it right. There is a TL;DR at the bottom. Following is some history and references.

cava's source setting is used to fill the target.object property of cava's pipewire node, which tells the session manager where to link it.

#557 removed the erroneous setting of the target.object property to the string "auto". Because "auto" was not a valid node.name, the search for the node failed and the fallback was used. stream.capture.sink = true is not taken into account when searching for a fallback, so the session manager picked the highest-priority device that had output ports, which is the default input device.

Ironically, manually selecting that same fallback input device (the mic) as cava's source, would have failed, because stream.capture.sink = true was set, and those nodes do not have input ports so they do not have monitor ports. Likewise, if the same default output device (speakers) were selected manually rather than using 'auto' to fall back to it, because stream.capture.sink = true was set, it would have worked fine, rather than falling back to the default input device.

#557 also set stream.capture.sink = true only for the default device when specified with "auto". This made cava work with the "auto" source (target.object unset), and coincidentally, because stream.capture.sink = true was unset for them, also made cava work with any input device nodes or filter playback nodes. These would not function otherwise because they lack the monitor ports which are needed for stream.capture.sink = true.

However, without the stream.capture.sink = true property to request linking to monitor ports, cava no longer worked with the default device set manually, with any non-default output device, or with filter capture nodes.

#422 (comment) was close but not quite correct, which is probably why it didn't work for everybody. Using the object.serial of the node will work, even without stream.capture.sink = true. Using the node.id is not supported and does not work; only the name or the serial can be used. However, the node.id and the serial will often correspond, in which case the suggestion of using the node.id will 'coincidentally' work.

This is by design: using the object.serial is intended to be more specific than just the name, so it 'tries harder' to link there, and will link to monitor ports even if stream.capture.sink = true is not set. If stream.capture.sink = true is not set and the less-specific node.name is used as the target.object, the node's monitor ports, if it has any, will not be considered for linking.

The wireplumber doc states that

Monitor ports are created on nodes that have input ports (i.e. sinks and capture streams)

Which corresponds to a media.class of Audio/Sink or Stream/Input/Audio

And in the case of sinks and capture streams, cava could not link to those input ports anyway, so the monitor is the only remaining option.

I have tested this in ugly patches to cava and could connect to the input monitor ports and output ports of loopback, filter, application, and device nodes. Here's my testing notes from the current version (copypasta from the config file):

method = pipewire
; default (auto) works (stream.capture.sink is set)
; source = auto
; same default (non-auto) specified as object.serial, works
; source = 188
; same default (non-auto) specified as node.name broken, falls back to default input device
; source = alsa_output.pci-0000_07_00.1.pro-output-3
; non-default (ie non-auto) output broken, falls back to default input device
; source = alsa_output.pci-0000_07_00.1.pro-output-8

; filter outputs work
; source = Master Bus Out
; filter inputs fail (falls back to default output device)
; source = Master Bus In
; filter inputs with pulse syntax fail (falls back to default output device)
; source = Master Bus In.monitor

; input devices work
; source = alsa_input.usb-C-Media_Electronics_Inc._USB_PnP_Sound_Device-00.pro-input-0
; non-default input devices work
; source = alsa_input.usb-USB_Camera_USB_Camera_SN0001-02.pro-input-0

; app outputs work (gets the last node if there are dupes)
; source = Firefox

; invalid nodes fall back to the default input device
; source=foo

TL;DR

If stream.capture.sink = true is set for cava's node, the target node's input monitor ports are linked. If it is not, the target node's output ports are linked.
If the relevant ports are not present, linking will fail, and the fallback device will be used.
Accordingly, stream.capture.sink = true must be set depending on the properties of the target node. Without this, it is not possible for cava to connect to many of our nodes.

To solve this, it is required to do something like:

  • Automatic - read the properties of the target object, and if the target node's media.class property is equal to Audio/Sink or Stream/Input/Audio, set stream.capture.sink = true on cava's node (other more specific monitor port detection schemes do work but don't seem to have any advantage)
  • Manual - Allow the user a control (argument) to set stream.capture.sink = true for cava's node, which they can do when they want to link to monitor ports
  • Hybrid/compatible - Use the pulseaudio naming schema where the target node name is specified with the .monitor suffix, set stream.capture.sink = true when it is present and strip the suffix before writing the remainder as the target node name.
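The 'Automatic' option above comes down to a string comparison on the target's media.class. Here is a minimal sketch in C, assuming the media.class has already been read from the target node. The helper name target_needs_capture_sink is hypothetical and is not part of cava or pipewire:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not cava's code.
 * Returns true when a target with this media.class only offers monitor
 * ports to a capture stream, i.e. when cava's node would need
 * stream.capture.sink = true to link to it. */
bool target_needs_capture_sink(const char *media_class)
{
    if (media_class == NULL)
        return false; /* unknown target: leave the property unset */

    return strcmp(media_class, "Audio/Sink") == 0 ||
           strcmp(media_class, "Stream/Input/Audio") == 0;
}
```

The result would then decide whether stream.capture.sink = true is added to the stream's properties before connecting.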

I hope this is helpful, if there's anything I could do to assist please let me know.

The wireplumber doc states that

Monitor ports are created on nodes that have input ports (i.e. sinks and capture streams)

Which corresponds to a media.class of Audio/Sink or Stream/Input/Audio

It occurs to me, that there is another media.class which was not mentioned, but would apply here: 'Audio/Duplex'.
These are both sink and source, and being a sink, they'd get sink monitor ports. This is an interesting case for cava because a user might want to monitor the source ports (sound coming out of the node) or the sink monitor ports (sound going into the node). In that case, it's not really possible to do any kind of automatic detection, because it's down to user preference.

So, given automatic handling can't work, the easiest manual handling seems to be the pulseaudio-compatible approach, of using the .monitor suffix to indicate that we want to link to the input monitor ports (I also considered ':monitor' because that's what pipewire uses for the port names, or both... I went with the familiar option, for now)

I did quickly patch cava for this and it works well. I tested on all combinations of:

mono, stereo, and 5.1
input (real and virtual), loopback, filter, smartfilter, output (real and virtual)
sink monitors and sources (real and virtual)
default devices specified manually

There is a small 'catch' with doing it this way. 'auto' should target the default input device, because the default output device is a sink only and thus requires stream.capture.sink = true, so 'auto.monitor' would be the more accurate name for it.

But, that'd be a breaking change, and it's not called 'default' so 'auto' is not incorrect and does not need to change. So, I made sure that 'auto' still works as it does now, 'auto.monitor' will do the same thing, and I allow for 'auto_input' (it could be any string really) so that users can still target the default input device, now that the default output device has 'auto'.
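The source-string handling described above (the .monitor suffix plus the 'auto', 'auto.monitor' and 'auto_input' special cases) could be sketched roughly as follows. This is an illustration only: struct source_spec and parse_source are hypothetical names, and the actual patch may be structured differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; field names are illustrative only. */
struct source_spec {
    char target[256];  /* value for target.object, "" means leave it unset */
    bool capture_sink; /* whether to set stream.capture.sink = true */
};

/* Parses cava's `source` setting. Returns false if the name is too long. */
bool parse_source(const char *source, struct source_spec *out)
{
    static const char suffix[] = ".monitor";
    const size_t suffix_len = sizeof(suffix) - 1;
    size_t len = strlen(source);

    /* 'auto' keeps its current meaning: monitor of the default output. */
    if (strcmp(source, "auto") == 0 || strcmp(source, "auto.monitor") == 0) {
        out->target[0] = '\0';
        out->capture_sink = true;
        return true;
    }
    /* 'auto_input': default input device, no monitor ports involved. */
    if (strcmp(source, "auto_input") == 0) {
        out->target[0] = '\0';
        out->capture_sink = false;
        return true;
    }

    /* Pulse-style suffix: strip it and ask for the monitor ports. */
    bool monitor = len > suffix_len &&
                   strcmp(source + len - suffix_len, suffix) == 0;
    if (monitor)
        len -= suffix_len;

    if (len >= sizeof(out->target))
        return false;
    memcpy(out->target, source, len);
    out->target[len] = '\0';
    out->capture_sink = monitor;
    return true;
}
```

With this, source = Master Bus In.monitor would target the node named "Master Bus In" with stream.capture.sink = true, while source = Master Bus Out would target the output ports as before.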

I did it this way with the intention of maintaining syntax and familiarity for users, and just generally making my changes as minimal as possible. There are definitely more fancy ways to do this if you like :)

It's only a few lines so I'll fire through a PR, I hope that's OK.

Hi @pallaswept,

This really clears things up, thanks a lot for this!

I always found that the pipewire source selection worked in strange ways, but I never really had time to completely wrap my head around it.

I will review this in more detail in some days when I have more time.

Looks good. Should a note about this also be added here:

# For pipewire 'source' will be the object name or object.serial of the device to capture from.

Hi @karlstav
Sorry it took me a while to get back to you there. I just wanted to make sure I hadn't missed something, because I was seeing some odd behaviour. I'm confident the above is good now, but there are still a couple of problems on the pipewire front, and I feel like it might be a bit beyond my paygrade to tinker with them, at least before I discuss it with you. I'll start with TL;DR because this reply is a bit verbose, sorry.

TL;DR (in order of certainty):

  1. The above is good to go, and I will add that note to the example config
  2. It would be nice to round the buffer size up to the nearest power of two
  3. Maybe also add node.virtual to the properties so it doesn't trigger recording notifications? ( #657 )
  4. node.always_process breaks stuff, do we want to fix that here? ( #556 )

So the first issue is pretty simple: the latency is set to 10ms, which at pipewire's default 48 kHz works out to 480 samples, so the node requests a quantum of 480. But pipewire (at least by default) rounds down to the nearest power of 2, so cava ends up demanding a quantum of 256 samples / 5.3ms. Weaker PCs aren't likely to handle that and I suspect this issue is just that.
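The arithmetic in that paragraph can be checked with a few lines of C. The helpers below are illustrative stand-ins (none of them are pipewire or cava functions), and the round-down rule is taken from the description above rather than from pipewire's source:

```c
#include <assert.h>
#include <stdint.h>

/* Number of samples covered by latency_ms at the given sample rate. */
uint32_t latency_to_samples(uint32_t latency_ms, uint32_t rate)
{
    return (uint32_t)((uint64_t)latency_ms * rate / 1000);
}

/* Largest power of two that is <= n. */
uint32_t prev_power_of_2(uint32_t n)
{
    assert(n > 0);
    uint32_t p = 1;
    while (p <= n / 2)
        p *= 2;
    return p;
}

/* Duration of one graph cycle, in milliseconds, for a given quantum. */
double quantum_to_ms(uint32_t quantum, uint32_t rate)
{
    return 1000.0 * quantum / rate;
}
```

For the 10ms / 48 kHz case: latency_to_samples(10, 48000) gives 480, prev_power_of_2(480) gives 256, and quantum_to_ms(256, 48000) is roughly 5.33.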

The other thing is a bit more... involved 😆

This kinda revolves around #556 . Pipewire allows us to set nodes to 'passive', which lets them transition into idle and suspended states when there are no 'active' nodes in the graph. That's really useful for things like filters that we don't really want running and taking up CPU when they're not doing anything. The problem with setting the cava node to 'always process', is that now it can't idle, which means nothing in the graph it's linked to can, either. It keeps the whole audio graph ticking over just to feed cava zeros.

This problem is exacerbated by the buffer size thing mentioned above, and in my use-case (that of the kurve KDE widget) where cava is started automatically at login. It's causing xruns because it's waking up a bunch of stuff right when the daemon has just started and the entire DE is starting. Now I'm sure you're thinking "well don't do that then? :D " and I do agree, but still, it would be nice to fix it, since it has other side-effects like power use, system performance (cores not sleeping), etc.

I did take a shot at fixing that by setting the node passive, and it does work... but, as you know, now pressing 'q' doesn't work, and the bars don't update. With the passive node I can now detect when the incoming streams are paused, and I tried using the reset function as per the fifo input (discussed in #557), but it didn't work (surely I did it wrong), and it wouldn't have solved the unresponsive 'q' key anyway, so it felt like I was just taking the wrong approach there.

cava being a passive node makes a lot of sense when cava's source is (the monitor ports of) an output device, because the app that makes the sound that we'd be seeing in cava, will activate the device, which will activate cava, so cava won't miss anything. When cava's source is an input device however, cava being passive is either really useful or really bad. Basically, if cava is passive, with its source a microphone, and nothing (aside from cava) is recording from the mic, then cava won't hear anything, because cava isn't active, so nor is the device. When an app starts to record from the mic, then it will activate the device and that will activate cava and cava will work as usual.

So, if you want cava to visualise "what am I recording", and save resources if you aren't recording, then passive is nice. If you want cava to visualise "what is the sound in the room even if I'm not recording it" then it's bad.
But if you want cava to visualise "what am I playing" then passive is basically always better. Except it kinda breaks cava. D'oh!
So it seems like, even if there was a fix for the frozen output, we kind of could do with a way to toggle this... Theoretically a user could set it outside of cava using rules or the like, but it can't be enabled at all because cava is set to 'always process'.

So I feel like that is a rabbit hole I fell into and probably a thing for another PR, another issue; but if you like, we could work something out and roll it in here? I'm obviously keen to help out but I also don't really know cava well and I do know there are always hidden traps in unknown projects :)

The other things I'm wondering about are changing the node.virtual property and the buffer size thing, which seem like maybe they'd be suited to add on to this? I wonder if node.virtual might be something people would prefer in the config, maybe people will want the notification/appearance in volume controls/etc.

Cheers, and my apologies. I didn't mean to get caught up in that and drag you into it 😆 I was on holidays and I liked the blinkenlights so I tried to help with something simple, and now look what I've done 🤦

hi @pallaswept

This is all great input! I am sure we can get something out of it.

  1. I could not find any documentation on the buffer thing. I checked with a debugger and it actually reads 470 n_samples per "on_process", with 44100 sample rate. If I up the requested delay to 512/44100 it goes to 900 something, a bit much maybe...?

  2. I tried setting the PW_KEY_NODE_VIRTUAL in the same manner as the PW_KEY_NODE_ALWAYS_PROCESS is set, but it still shows the recording notification icon.

  3. Key controls not working is really just exiting or reloading not working, since we are not able to tell pipewire to quit. We set audio.terminate, but it is only read within the on_process function. So when always_process is not on and nothing is happening, it can't quit. This means that the main thread is waiting for the audio thread to join, but it never happens. Not sure how to solve it gracefully. There has to be some other event we can use. Something to trigger somehow.

Before I go on: Fear not, the above is still good! And there's a TLDR summary down the end you might like to skip to there :) I'll make a new branch and commit the below stuff if you're interested. (happy to share a patch file or binary)

Well, this gets rabbit-hole of the year award 2025 :D There turned out to be a few inter-related issues with inter-related fixes. There was a lot more going on here than I expected and I feel a little rude bothering you with all these papercuts out of the blue. I did my best to fix it all with minimal impact but it just didn't want to happen, so today I decided to just try and do the best job of it I could, and avoid future problems, even if the changes now were a bit more significant. I won't be offended if you'd rather not take these on board but I didn't want to leave it unfinished so here 'tis!

2/
I'm looking for some docs for reference... This is probably the best summary: https://gitlab.freedesktop.org/pipewire/pipewire/-/wikis/Config-PipeWire#setting-sample-rates
There's the daemon config page which mentions the 48k default and the power-of-two quantum https://docs.pipewire.org/page_man_pipewire_conf_5.html
I came across this one which might be relevant too https://docs.pipewire.org/page_scheduling.html

Basically, pw will take whatever interval is requested, scale it according to the graph's sample rate, and round down to the nearest power of two so that it will service that buffer at least often enough to fill it at the requested rate.

pw-top is a nice window into this (likewise coppwr if you prefer a GUI, it's also useful as you can edit some node properties)
Some examples: (there's nothing cool behind the blur it's just for relevance)

Here's my system with my browser running a discord tab and playing a video (with audio for this test):
Related: the '1' in the 'ERR' column is from cava starting at boot.

[Image: pw-top screenshot]

firefox is asking for 3600 and 900, since 900 is least, it tries for that, and rounds down to the nearest power of two, so the graph runs at the quantum shown on the driver (my soundcard, top row)

Here's a build with the 10ms buffer:
At 44k1:
[Image: pw-top screenshot]
At 48k:
[Image: pw-top screenshot]
Note that the graph and driver are now running at 256/48000 = 5.3ms, which is a lot tighter timing than the 10ms requested.
And here's a build with the buffer set to next_power_of_2((512 * data.cava_audio->rate / 48000 )); (basically, it's set to about 10.67ms instead of 10ms).
With that formula (512 * data.cava_audio->rate / 48000 ) we get 470 samples at 44k1 so that seems about right, but it was just a thing I tried and it gives a nice example of cava's effect on the polling rate of the server.
At 44k1:
[Image: pw-top screenshot]
At 48k:
[Image: pw-top screenshot]

So that 0.6ms is really costly, as though it were 5ms... but your insights with the debugger are super interesting! I'm wondering about little details now, like does it do 471 every third time :) There are ways to force the graph's rate and quantum, perhaps that might be nice to check with the debugger.

If we wanted to be really fancy I'm pretty sure it's possible to get the graph sample rate and then we could scale the denominator to ensure that cavacore is always fed powers of two. I saw you mention that in another issue but I wasn't sure you wanted to do it yet. cava seems to run well anyway :) I also saw you mention that the rate here determines cava's refresh rate, so maybe you'd prefer to align to that? Anyway it seems to work as is, so I haven't touched that, aside from making sure that pw doesn't jump down a quantum.

That 470 had me scratching my head for a moment, then I punched it into a calculator and realised I was looking at a 10.667ms interval with rounding, and 10.6ms is 512/48k. I'm guessing that it's scaled the requested quantum: 512/48000*44100=470.4. 1024 would be 940.8. The numbers line up but I'm unsure how it got there, I guess it depends on the graph rate and quantum on your machine at the time. Is n_samples 'one channel per sample'? so stereo means n_samples/2?
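The scaling guessed at above, and the round-up used in the test build, can be reproduced with integer arithmetic. This is a sketch only: next_power_of_2 here is a stand-in for the helper named in the post (its real definition isn't shown), and scale_quantum is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= n. */
uint32_t next_power_of_2(uint32_t n)
{
    assert(n > 0 && n <= 0x80000000u);
    uint32_t p = 1;
    while (p < n)
        p *= 2;
    return p;
}

/* A quantum expressed in samples at 48 kHz, rescaled to another rate.
 * Integer division truncates, so 470.4 becomes 470. */
uint32_t scale_quantum(uint32_t quantum_at_48k, uint32_t rate)
{
    return (uint32_t)((uint64_t)quantum_at_48k * rate / 48000);
}
```

scale_quantum(512, 44100) gives the 470 seen in the debugger and scale_quantum(1024, 44100) gives 940. Rounding 470 up with next_power_of_2 lands on 512, so under the round-down behaviour described earlier the request would no longer drop to 256.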

A related thing I wasn't going to trouble you with but since channel count has come up - if cava's source is set to a surround node, it will become a surround node. The FL and FR channels are processed as expected and the others discarded. At the time, it's also using the buffer sized for 2 channels, and filling it from 6.
Late edit: I ended up fixing the maximum channels to two and adding a config option for remixing (set by default).

3/
Strange about that property, it's working here. I have pw_properties_set(props, PW_KEY_NODE_VIRTUAL, "true"); set right below where PW_KEY_NODE_ALWAYS_PROCESS is set. If I remove it and run cava, I get the icon. I wonder if we have different KDE versions? I'm on 6.4.3 here. It's quite possible there's some other setting in my config that's combining with that property, I'll try it in a clean VM if I can't figure it out.
Later edit: It did work in a VM with stock config for me. I ended up adding a config option for this.

4/
As far as I can tell there's no built-in mechanism like a callback when the variable changes, but there may be a loop that can run at idle (there's no documentation, but a method with a suggestive name exists).

Would polling the variable with a timer suffice? I managed to get this working. It's pretty basic but functions as you'd expect. I'm not sure what interval would be suitable, but for a test I fired it at 60fps to clear the graph and then slowed it to 250ms to poll for .terminate.

The bars were still frozen. Per your suggestion in the other issue, I tried reset_output_buffers() but it didn't seem to work, is there a good way to do that? I fed zeros to write_to_cava_input_buffers() and it did the trick but I couldn't make it work like the fifo input and I'm unsure why.
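To make the timer idea above a bit more concrete, here is a self-contained sketch of the callback logic only. It does not use pipewire's loop API or any cava function. The struct, the function name and the number of 'clear' ticks are all hypothetical; only the roughly 60fps and 250ms intervals come from the description above.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CLEAR_INTERVAL_MS 16  /* roughly 60 fps while clearing the bars */
#define POLL_INTERVAL_MS  250 /* slow poll once the bars are cleared */

/* Hypothetical state for the idle timer. */
struct idle_timer {
    atomic_bool terminate; /* stand-in for the flag set by the main thread */
    int clear_ticks_left;  /* how many silent buffers are still to be fed */
};

/* Called on each timer expiry while no audio is arriving.
 * Returns the next timer interval in ms, or 0 when the loop should quit. */
int idle_timer_tick(struct idle_timer *t, int16_t *buf, size_t n_samples)
{
    if (atomic_load(&t->terminate))
        return 0;

    if (t->clear_ticks_left > 0) {
        /* Fill the buffer with silence; real code would then hand it to
         * cava's input so the bars fall back to zero. */
        memset(buf, 0, n_samples * sizeof(*buf));
        t->clear_ticks_left--;
        return CLEAR_INTERVAL_MS;
    }
    return POLL_INTERVAL_MS;
}
```

In a real build the return value would be used to re-arm whatever timer the audio thread's loop provides, and clear_ticks_left would be reset whenever the stream goes idle again.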

TL;DR

So anyway, the summary now is:

  • monitor node as source - fixed by using pulse syntax (.monitor suffix) (merged).
  • recording notification - fixed by node.virtual (optionally disable with virtual = 0). (tested in VM with stock config)
  • Graph can't sleep - fixed with passive node (optionally engage 'always process' with active = 1 option)
  • quantum rounded down from 10 to 5.3ms - fixed by rounding up to nearest ^2 (test with debugger to ensure not too slow?).
  • 'lost' surround channels - fixed by channel count limit with remix = 0 option to keep the old behaviour.
  • xruns at boot - fixed by passive mode, quantum adjustments and channel count limit.
  • q key not responding / bars not updating - fixed with passive node and timer when idle. is there a better way to clear the output?
  • plus - set node.monitor to flag the server to use a more efficient converter
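As a rough picture of how the virtual and active options in this summary might map onto node properties, here is a sketch that builds a plain key/value list. It is an assumption-laden illustration: struct pw_options, struct kv, build_node_props and kv_get are hypothetical, and the key strings are what the PW_KEY_NODE_VIRTUAL, PW_KEY_NODE_PASSIVE and PW_KEY_NODE_ALWAYS_PROCESS macros are assumed to expand to. Real code would pass the macros to pw_properties_set instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the config options mentioned in the summary. */
struct pw_options {
    bool virtual_node; /* virtual = 1: hide from recording notifications */
    bool active;       /* active = 1: keep 'always process', else passive */
};

struct kv {
    const char *key;
    const char *value;
};

/* Fills out[] with the properties to set and returns how many were written. */
size_t build_node_props(const struct pw_options *o, struct kv *out, size_t max)
{
    size_t n = 0;

    if (o->virtual_node && n < max)
        out[n++] = (struct kv){"node.virtual", "true"};

    if (n < max) {
        if (o->active)
            out[n++] = (struct kv){"node.always-process", "true"};
        else
            out[n++] = (struct kv){"node.passive", "true"};
    }
    return n;
}

/* Looks up a key in the list; returns its value or NULL if absent. */
const char *kv_get(const struct kv *props, size_t n, const char *key)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(props[i].key, key) == 0)
            return props[i].value;
    return NULL;
}
```

Keeping active as an explicit opt-in matches the observation above that a passive node lets the graph sleep, while 'always process' keeps it running.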

I've tested this and found a working config for each combination of sources, sinks, stream capture monitors, and playback streams; virtual and real; mono and stereo, 1 and 2 channel (not the same, that's fun), and 5.1, with the surround capturing only the front L/R channels or being mixed down to stereo so all 6 channels show in cava; drivers, followers; passive and active, as the source or for cava itself; starting idle, starting active, cycling three times each... I think that's it? TOOO many combinations :D

As you can see these all kinda meshed in together at the middle. Sorry it took me so long, I had to RTFM an awful lot 😆 All of these things come direct from pipewire's docs and example apps, it just took some figuring out.

I hope all this is helpful. I've read a lot more pipewire docs and examples than I planned to, but cava is really nice to work on - it takes 11 seconds to compile and testing it is always pretty 🤩

master...pallaswept:cava:pipewire

That might need some touching up.

But pw-top is happy and free of xruns :)

[Image: pw-top screenshot]

And lower CPU usage 🎉

This all looks like solid work. If I had more time on my hands I would have happily dug into it myself as well.

If you could make the new changes opt in (for starters), I will consider merging them in. I don't want to make too many changes for people (things are bound to blow up for some).

btw I use gnome (regular ubuntu 25.04), here is my recording notification:

[Image: GNOME recording notification screenshot]

still there even with pw_properties_set(props, PW_KEY_NODE_VIRTUAL, "true");

not critical to me, maybe it just behaves differently

If you could make the new changes opt in

Good call, I'll get on it. edit: filed #670

If I had more time on my hands I would have happily dug into it myself as well.

Yeh that's what I was thinking while doing this - it's not difficult, just time-consuming, the kind of thing a person doesn't usually have free time for... but I was on a holiday and cava is nice :)

here is my recording notification:

Ahh a perfect example - searching the source code of an app you don't use for a thing you don't know what it's called... not difficult, just time consuming :D
I think that this is the relevant code: https://github.com/GNOME/gnome-shell/blob/63bf8674ff4a6c480da2ff03ff600094128f1ae7/js/ui/status/volume.js#L401-L412

I'm reminded of what I said in #657 :

I figure either there's a hardcoded whitelist of apps, which seems very brittle (KDE devs would be lining themselves up for endless "please add my new app to the whitelist" requests), or there's some kind of property that plasma uses...

Well apparently I think like a KDE dev because gnome has a hardcoded list hahaha ... I don't think we can fix that from this side :(

A fun post just for good vibes.

I noticed one of my cava instances using about half the CPU time of the other, just in normal usage, so I grabbed some performance numbers on this today just for fun.

I have two cava instances that start at boot, one recording my mic, and the other my speakers. I tested with audio being captured and played back, and with no audio capture/playback, each with both cava set to active=1 or both active=0. The system was otherwise idle at the desktop. Each example I ran for 5 minutes after a reboot before grabbing these numbers. Times are measured in jiffies, the CPU is a 5900X in balanced EPP mode, with a noctua D15.

These aren't super solid stats taken from a zillion samples (six, exactly :D) but just something representative of what this is about. The difference is, as expected, not gigantic, but it is significant, and really stacks up if the system runs cava full-time but doesn't play audio full-time:

### Running                                         # Same for Active (active=1) or Passive (active=0) cava

pipewire CPU ms/s:             19.654088            # mic is being recorded and played back via speakers, pipewire is running the graphs
pulse CPU ms/s:                 2.955974            # audio via pulse via firefox https://mozilla.github.io/webrtc-landing/gum_test.html
cava mic CPU ms/s:              9.041533            # cava is awake and being gorgeous
cava speakers CPU ms/s:         9.265175            # as usual

CPU Usage Total:               4.0916770 %
CPU Temp:                      48.1

### Active and Idle

pipewire CPU ms/s:             24.842767            # There is no media playing or being recorded, aside from cava, but pipewire is running for cava
pulse CPU ms/s:                  .094637            # pulse is idle because cava is using pipewire natively
cava mic CPU ms/s:              9.041533            # The microphone never is totally silent, so this cava can not sleep
cava speakers CPU ms/s:         1.661341            # Cava is clever and since this one is being actively fed zeros, it sleeps

CPU Usage Total:               3.5640278 %
CPU Temp:                      49.2                 # In spite of near-halved cava usage, CPU temp barely changes, obviously cava is not generating most of the heat

### Passive and Idle

pipewire CPU ms/s:               .396341            # pipewire is idle/suspended
pulse CPU ms/s:                  .091743
cava mic CPU ms/s:              2.383900            # cava is sleeping, but does have an idle loop so it can catch user inputs/resizes/config changes/signals/etc while pipewire sleeps
cava speakers CPU ms/s:         2.414860

CPU Usage Total:               0.5286844 %
CPU Temp:                      42.9                 # CPU spends more time in lower clocks so it runs cooler

So for ballpark figures, when the system isn't otherwise playing or recording audio, active=0 saves about 90% of the total CPU time of cava and pipewire, and it keeps the CPU about 5C colder, and draws about 4W less (about 10% less), on this machine. When pipewire is busy, but cava is not involved, it will save about 85% per cava instance.
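The idle figures posted above can be cross-checked with a little arithmetic. The sketch below sums the four per-process values for each idle case and compares them. The function names are illustrative, and the ms/s to percent conversion is an assumption that happens to match the posted 'CPU Usage Total' lines.

```c
#include <stddef.h>

/* Sum n CPU-time figures given in ms of CPU per second of wall time. */
double total_ms_per_s(const double *v, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Assumed conversion: 1000 ms/s would be one fully busy core,
 * so percent = ms/s divided by 10. */
double ms_per_s_to_percent(double ms_per_s)
{
    return ms_per_s / 10.0;
}

/* Fraction of CPU time saved when going from `before` to `after`. */
double saving_fraction(double before, double after)
{
    return 1.0 - after / before;
}

/* Figures copied from the "Active and Idle" and "Passive and Idle" blocks:
 * pipewire, pulse, cava mic, cava speakers. */
const double active_idle[4]  = {24.842767, 0.094637, 9.041533, 1.661341};
const double passive_idle[4] = {0.396341, 0.091743, 2.383900, 2.414860};
```

With these inputs the totals come to about 35.64 and 5.29 ms/s, and saving_fraction works out to roughly 0.85, a little under the rounded ballpark quoted above.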

So, you know, it's not like WOW OMG, but it's nice to have and I think it feels satisfying to be able to measure the rewards of your hard work and confirm that it really did have the desired effect :)

Pinging @luisbocanegra as he also worked with us on the new pipewire stuff.

Have a great weekend you two!