bastibe/PySoundCard

Allow selecting devices by device name substrings


In addition to numeric device IDs, it should also be possible to select devices by strings, e.g. device="hdmi".

I think it would be practical to use a case-insensitive search. A space-separated list of substrings would probably also be useful, e.g. device="hdmi 1".

We could allow glob patterns and/or regular expressions, but I think that might be overkill.

If a string matches several devices, an error should be raised.

Substrings and/or globs by default seem a bit magical. Having two functions, one that looks up a device by exact string and another that iterates through the names, is enough for people to build any kind of matching they like on top.

We could provide different selection methods, like

search_device(id, name=None, glob=None, match_substring=False)

Would that be un-magical enough?

Makes sense..

I guess glob, regex and plain string matching are common enough that this could be a useful API.

The use cases here are probably GUIs (substring match) and the command line (regex/glob).
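
For reference, glob and regex matching over device names need nothing beyond the standard library. This is only a rough sketch: the device names are made up for illustration, and match_glob and match_regex are hypothetical helper names, not part of pysoundcard.

```python
import fnmatch
import re

# Hypothetical device names, for illustration only.
names = [
    "HDA Intel PCH: ALC269VB Analog (hw:0,0)",
    "HDA Intel PCH: HDMI 0 (hw:0,3)",
    "Loopback: PCM (hw:1,0)",
]

def match_glob(pattern, names):
    """Return the names matching a shell-style glob, ignoring case."""
    return [n for n in names
            if fnmatch.fnmatchcase(n.lower(), pattern.lower())]

def match_regex(pattern, names):
    """Return the names in which the regular expression is found, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [n for n in names if rx.search(n)]

print(match_glob("*hdmi*", names))
print(match_regex(r"hw:1,\d", names))
```

Note that a glob has to cover the whole name (hence the leading and trailing "*"), while the regex version uses search() and so behaves like a substring match.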

I think having to call a separate function for this doesn't really simplify anything.
I would like to be able to do something like this:

s = Stream(device="system")

I don't think the suggested options (glob, match_substring) fit well into this API.
I think there should be only one way these strings are handled.
We just have to choose which one is the best (simplest/most useful/most flexible/least surprising/...).

For me, the main motivation for this feature is convenience. All this can be done in arbitrary ways in user code already, but I would like to have one convenient built-in way to select a device by string.
IMHO, the main use case would be an interactive session, where the amount of typing should be reduced.

As @bastibe and I discussed offline, the string to compare against should also include the name of the host API.
This would be especially useful on Windows, where the same device name is repeated for each host API (and matches would never be unique if only the device name is taken into account).
@bastibe also suggested fuzzy search, can you please describe this here?
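
To illustrate the host API point, here is a small sketch of matching against a combined "device name, host API name" string. The device records, the 'host_api_name' key and the helper names are assumptions made up for this example; they are not pysoundcard's actual device dictionary or API.

```python
# Hypothetical device records; the keys 'name' and 'host_api_name' are
# assumptions for this sketch, not pysoundcard's real device dictionary.
example_devices = [
    {"name": "Speakers (Realtek Audio)", "host_api_name": "MME"},
    {"name": "Speakers (Realtek Audio)", "host_api_name": "Windows WASAPI"},
]

def search_string(device):
    """Combine device name and host API name into one string to match against."""
    return "{}, {}".format(device["name"], device["host_api_name"])

def find(expression, devices):
    """Return devices whose search string contains every space-separated term."""
    terms = expression.lower().split()
    return [d for d in devices
            if all(t in search_string(d).lower() for t in terms)]

print(len(find("speakers", example_devices)))         # ambiguous: 2
print(len(find("speakers wasapi", example_devices)))  # unique: 1
```

With only the device name, "speakers" can never be unique here; adding a host API term is what disambiguates it.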

We should probably collect a list of examples here, in order to be able to choose which selection method is best for our goal.

When using JACK, it would be useful to specify device="system", because this is typically the JACK port name for the actual soundcard. This is especially important since JACK doesn't seem to be the default host API (at least on Linux), even if the JACK daemon is running (and blocking the hardware for all other uses).
In this case "system" is the full device name, so neither substring nor glob matching would be needed.

Another example would be to use ALSA device numbers like this: device="hw:0,0".
The full device name would be, e.g., "HDA Intel PCH: ALC269VB Analog (hw:0,0)".
A simple substring search would take care of this.

Another way to select the same example device would be device="intel analog".
In this case, a case-insensitive search for all substrings in a space-separated list would do.

Please add more examples!

Fuzzy matching means that a search string such as "fb" is interpreted as ".*f.*b.*", and therefore matches both "facebook" and "foobar". This is very useful for searching for, say, "mic ca", and finding "Internal Microphone (CoreAudio)". Good implementations of this prefer longer substring matches, so that "usb" will prefer "usb microphone" over "usable binary".
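
A minimal sketch of this kind of fuzzy matching, using only the standard library. It is not how any particular fuzzy-matching library works: as a simple stand-in for "prefer longer substring matches", it ranks candidates by the shortest span that contains the query characters in order. The names fuzzy_score and fuzzy_sort are made up for this example.

```python
import re

def fuzzy_score(query, candidate):
    """Return the length of the shortest span of candidate containing the
    query's characters in order (ignoring case and spaces), or None."""
    chars = [re.escape(c) for c in query.lower() if not c.isspace()]
    if not chars:
        return None
    # The lookahead lets every start position be tried, including
    # overlapping ones; the lazy '.*?' gives the shortest match per start.
    pattern = re.compile("(?=(" + ".*?".join(chars) + "))")
    spans = [len(m.group(1)) for m in pattern.finditer(candidate.lower())]
    return min(spans) if spans else None

def fuzzy_sort(query, candidates):
    """Return the matching candidates, tightest match first."""
    scored = [(fuzzy_score(query, c), c) for c in candidates]
    return [c for s, c in sorted((s, c) for s, c in scored if s is not None)]

print(fuzzy_sort("usb", ["usable binary", "usb microphone"]))
# → ['usb microphone', 'usable binary']
```

Both names match "usb", but "usb microphone" comes first because the three characters are adjacent there.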

Fuzzy matching sounds best indeed. I'm kind of coming round to this, since command-line and GUI pysoundcard apps would be using the same kind of matching.

I implemented something like this using the Python module 'fuzzywuzzy':

$ pip install fuzzywuzzy

from pysoundcard import devices
from fuzzywuzzy import fuzz

def fuzzydevices(match=''):
    device_ratios = []
    for device in devices():
        ratio = fuzz.partial_ratio(device['name'], match)
        if ratio > 30:
            device_ratios.append((ratio, device))

    for ratio, device in sorted(device_ratios, key=lambda ratio_device: (-ratio_device[0])):
        yield device

>>> for device in fuzzydevices('loop pc'):
...     print(device['name'])
...
Loopback: PCM (hw:1,0)
Loopback: PCM (hw:1,1)

It works quite well. I tried some pretty tenuous input and the results seem pretty sane. Give it a try.

I updated this slightly for some code I'm using, so that there is an API to get one device, no matter what:

def fuzzydevices(match='', min_ratio=30):
    device_ratios = []
    for device in devices():
        ratio = fuzz.partial_ratio(device['name'], match)
        if ratio > min_ratio:
            device_ratios.append((ratio, device))

    for ratio, device in sorted(device_ratios, key=lambda ratio_device: (-ratio_device[0])):
        yield device

def firstfuzzydevice(match=''):
    # min_ratio=0 keeps every device with a nonzero ratio;
    # raises IndexError if no device is left.
    matches = list(fuzzydevices(match, 0))
    return matches[0]

I guess this API should allow matching other device parameters too. I'll have a look into this when I get some time.

That's cool! I'll have a look at fuzzywuzzy!

I'm still not convinced that fuzzy matching provides an actual advantage compared to a plain substring match.
I'm even less convinced that it is worth adding another dependency.

@stuaxo: can you please provide a few concrete examples for search expressions and device strings that you would use?

The only example so far was "mic ca" for matching "Internal Microphone (CoreAudio)", but this would IMHO just as easily be matched by "mic co" using simple substring matching:

devices = [...]  # iterable of strings
expression = "mic co"
matches = []
for d in devices:
    if all(s.lower() in d.lower() for s in expression.split()):
        matches.append(d)
if len(matches) == 1:
    print("I found", repr(matches[0]))
elif len(matches) == 0:
    print("I didn't find anything matching", repr(expression))
else:
    print("I found multiple matches:", matches)

I've only used "loop" so far, to grab a loopback device; I guess this should work with substring matching too. In the future I'll probably be using line-in, but I can't check that as I don't have it on this laptop.

Once my current project is further along I should have more of an idea about this.

I guess being able to select a single device is the useful part of this. Additional filtering is something that should live in whatever GUI project is using pysoundcard?