bastibe/SoundCard

Problems with the "fuzzy matcher" in get_microphone()

The "fuzzy matcher" for soundcard names in get_microphone() has two problems (observed on Ubuntu 20.04, but they might exist on Windows and macOS too):

  1. When get_microphone() is called with include_loopback=True, it can be impossible to get a non-loopback device. Example:
>>> soundcard.all_microphones(include_loopback=True)
[<Loopback Monitor of Eingebautes Tongerät Analog Stereo (2 channels)>, 
 <Microphone Eingebautes Tongerät Analog Stereo (2 channels)>,
 <Loopback Monitor of PCM2702 16-bit stereo audio DAC Analog Stereo (2 channels)>,
 <Loopback Monitor of Eingebautes Tongerät Digital Stereo (HDMI 2) (2 channels)>]
>>> soundcard.get_microphone('Eingebautes Tongerät Analog Stereo', include_loopback=True)
<Loopback Monitor of Eingebautes Tongerät Analog Stereo (2 channels)>

So we have a real input, "Eingebautes Tongerät Analog Stereo", and a loopback device with the same name prefixed by "Monitor of ". Which of these devices _match_soundcard() returns depends on the order in which soundcards_by_name.items() yields the devices and their names in line 396 of pulseaudio.py. Unfortunately, this order is arbitrary. In the example above, it even changes when the PCM2702 (a USB audio adapter) is not connected: in that case, soundcard.get_microphone('Eingebautes Tongerät Analog Stereo', include_loopback=True) returns the microphone, not the monitor.

I fixed this by adding an exact-name lookup right after the id lookup: if the call parameter id is a key in the dictionary soundcards_by_name, the corresponding device is returned directly.

+++ b/soundcard/pulseaudio.py
@@ -392,12 +392,18 @@ def _match_soundcard(id, soundcards, include_loopback=False):
         soundcards_by_name = {soundcard['name']: soundcard for soundcard in soundcards}
     if id in soundcards_by_id:
         return soundcards_by_id[id]
+    if id in soundcards_by_name:
+        return soundcards_by_name[id]
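
The lookup order from the patch can be tried in isolation with a small stand-in. This is only a sketch and not the library's code: match_soundcard, the plain dictionaries with 'id' and 'name' keys, and the device ids are made up for illustration, and the looser fallback is assumed to be a simple substring test.

```python
def match_soundcard(id, soundcards):
    """Hypothetical stand-in for _match_soundcard(); not the library's code."""
    by_id = {card['id']: card for card in soundcards}
    by_name = {card['name']: card for card in soundcards}
    if id in by_id:        # exact id match
        return by_id[id]
    if id in by_name:      # exact name match (the lookup added by the patch)
        return by_name[id]
    # Looser fallback, assumed here to be a plain substring test:
    # the first name containing `id` wins, so the result depends on list order.
    for name, card in by_name.items():
        if id in name:
            return card
    raise IndexError('no soundcard with id {}'.format(id))


# Made-up ids; the names are taken from the listing above.
CARDS = [
    {'id': 'builtin.monitor', 'name': 'Monitor of Eingebautes Tongerät Analog Stereo'},
    {'id': 'builtin', 'name': 'Eingebautes Tongerät Analog Stereo'},
]

print(match_soundcard('Eingebautes Tongerät Analog Stereo', CARDS)['id'])
print(match_soundcard('Tongerät', CARDS)['id'])
```

With the exact-name step in place, the full name selects the real input even though the monitor comes first in the list. A partial name such as 'Tongerät' still falls through to the order-dependent fallback and returns the monitor here.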

  2. A call like soundcard.get_microphone('Crazy device (name') makes re.match(pattern, name) (line 402 in pulseaudio.py) raise an error, because the unbalanced parenthesis ends up unescaped in the regular expression. My fix:
-    pattern = '.*'.join(id)
+    id_parts = list(id)
+    for special_re_char in r'.^$*+?{}\[]|()':
+        while special_re_char in id_parts:
+            id_parts[id_parts.index(special_re_char)] = '\\' + special_re_char
+    pattern = '.*'.join(id_parts)

(There may be a more elegant way to implement the replacement of '{' with '\{' etc. than two nested loops...)
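
One possibly tidier route is the standard library's re.escape(), applied to each character before joining. The sketch below is a suggestion rather than the project's code; fuzzy_pattern is a hypothetical helper name, and the device name in the example is invented.

```python
import re


def fuzzy_pattern(id):
    """Build the 'same characters, in order, anything in between' pattern.

    Each character is escaped on its own, so regex metacharacters in the
    device name are matched literally.
    """
    return '.*'.join(re.escape(char) for char in id)


pattern = fuzzy_pattern('Crazy device (name')
# The unbalanced parenthesis no longer breaks the regular expression.
print(bool(re.match(pattern, 'Crazy USB device (name of vendor)')))
```

Escaping per character keeps the '.*' separators intact; calling re.escape() on the already joined pattern would escape the separators as well.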

For my personal needs I simply hacked pulseaudio.py. After reading coreaudio.py and mediafoundation.py, I assume that these problems with the device lookup also exist under Windows and macOS, so a proper fix should address these modules too.

I can provide a pull request with a fix for all three modules – but I have neither a Mac nor a Windows installation, so I cannot properly test for any errors I could introduce there... (Yeah, this is a quite simple change, but I have already been bitten by trivial errors in code that I could not test due to lack of hardware or other circumstances... In other words: Such a pull request would need testing by somebody else.)

That's an interesting observation!

Perhaps the easiest solution would be to sort loopback devices after native devices in the internal search loop. That way it would always first match the native device, before finding a loopback.
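
The ordering idea can be sketched as follows. This is only an illustration under assumptions: devices are plain dictionaries with made-up ids, loopback devices are recognized by an id ending in '.monitor' (the suffix PulseAudio uses for monitor sources), and native_first and first_substring_match are hypothetical helper names.

```python
def native_first(soundcards):
    """Return the devices with native ones ahead of loopback monitors."""
    # False sorts before True, and sorted() is stable, so the relative
    # order inside each group is preserved.
    return sorted(soundcards, key=lambda card: card['id'].endswith('.monitor'))


def first_substring_match(id, soundcards):
    """Search native devices first, then loopback devices."""
    for card in native_first(soundcards):
        if id in card['name']:
            return card
    raise IndexError('no soundcard with id {}'.format(id))


# Made-up devices, with the monitor deliberately listed first.
DEVICES = [
    {'id': 'builtin.monitor', 'name': 'Monitor of Built-in Analog Stereo'},
    {'id': 'usb.monitor', 'name': 'Monitor of USB DAC Analog Stereo'},
    {'id': 'builtin', 'name': 'Built-in Analog Stereo'},
]

print(first_substring_match('Built-in Analog Stereo', DEVICES)['id'])
```

A loopback device is still reachable when no native device matches, for example by searching for 'USB DAC' in the list above.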

Of course, there's always the option to use all_microphones and select the device in question manually, so thankfully only the convenience method get_microphone is affected. Additionally, you can use get_microphone to select a device by ID, although there's currently no documented way of getting the ID, IIRC.

If you'd like to contribute a fix as a pull request, I'll be happy to help test it!