Multi-channel feature extraction: when to show channel information?
frankenjoe opened this issue · 4 comments
When processing multi-channel signals, we currently do not show channel information if a single channel is selected, e.g.:
interface_multi_channel = audinterface.Feature(
    ['mean', 'std'],
    process_func=features,
    process_func_is_mono=True,
    channels=[0, 1],
)
interface_multi_channel.process_signal(
    signal_multi_channel,
    sampling_rate,
)
                                      0                   1
                                   mean       std      mean       std
start  end
0 days 0 days 00:00:01.898250 -0.000311  0.082317  0.000311  0.082317
but:
interface_multi_channel = audinterface.Feature(
    ['mean', 'std'],
    process_func=features,
    process_func_is_mono=True,
    channels=[1],
)
interface_multi_channel.process_signal(
    signal_multi_channel,
    sampling_rate,
)
                                   mean       std
start  end
0 days 0 days 00:00:01.898250  0.000311  0.082317
or:
interface_multi_channel = audinterface.Feature(
    ['mean', 'std'],
    process_func=features,
    process_func_is_mono=True,
    channels=1,
)
interface_multi_channel.process_signal(
    signal_multi_channel,
    sampling_rate,
)
                                   mean       std
start  end
0 days 0 days 00:00:01.898250  0.000311  0.082317
In the latter two cases, we lose the information about which channel the features were extracted from. So maybe we should at least keep it in the [1] case and instead output:
                                      1
                                   mean       std
start  end
0 days 0 days 00:00:01.898250  0.000311  0.082317
If you use Feature to create a feature extractor, the selected channel is stored in the settings, so we will not lose any information.
Always returning the selected channel in the dataframe, independent of the number of selected channels, would make it more complicated to work with the results and would completely break backward compatibility, which I would say we can no longer afford.
If you use Feature to create a feature extractor, the selected channel is stored in the settings, so we will not lose any information.
That requires access to the code that generated the features. If, for instance, we store the features in audformat, this information would be lost.
Always returning the selected channel in the dataframe, independent of the number of selected channels, would make it more complicated to work with the results and would completely break backward compatibility, which I would say we can no longer afford.
That's why I suggested differentiating between 1 and [1] and only including the channel number in the latter case. At least that way the user has the option to encode the channel number.
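To make the difference concrete, here is a minimal sketch of the two rules. The function names are hypothetical and not part of audinterface; the sketch only models the decision of whether a channel level is shown, not the feature extraction itself.

```python
from typing import Sequence, Union


def shows_channel_current(channels: Union[int, Sequence[int]]) -> bool:
    """Current rule: show the channel level only for two or more channels."""
    n_channels = 1 if isinstance(channels, int) else len(channels)
    return n_channels > 1


def shows_channel_proposed(channels: Union[int, Sequence[int]]) -> bool:
    """Proposed rule: show the channel level whenever a list is passed."""
    return not isinstance(channels, int)


for channels in (1, [1], [0, 1]):
    print(
        channels,
        shows_channel_current(channels),
        shows_channel_proposed(channels),
    )
```

The two rules only disagree for a single-element list such as [1].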
Even if you do it only for [1], it will break tons of existing scripts.
What it would not solve is the case where the user combines channels with mixdown. So it is up to the user anyway to store the information about which channel(s) were selected and possibly combined. Let's stick with the current solution of only showing it if two or more channels are returned.
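For users who want to keep the channel number together with the features, one option is to add it as an extra column level after extraction. The following is a minimal sketch with pandas, assuming the result is a regular DataFrame with a (start, end) index like the ones shown above. The helper add_channel_level is hypothetical and not part of audinterface, and the DataFrame is built by hand as a stand-in for the output of process_signal.

```python
import pandas as pd


def add_channel_level(df: pd.DataFrame, channel: int) -> pd.DataFrame:
    """Prepend a column level that holds the channel number.

    Hypothetical helper, not part of audinterface.
    """
    # Passing a dict to pd.concat uses its keys as the outer column level
    return pd.concat({channel: df}, axis=1)


# Hand-built stand-in for a single-channel result
index = pd.MultiIndex.from_arrays(
    [
        pd.to_timedelta([0], unit='s'),
        pd.to_timedelta([1.898250], unit='s'),
    ],
    names=['start', 'end'],
)
df = pd.DataFrame(
    [[0.000311, 0.082317]],
    index=index,
    columns=['mean', 'std'],
)

df_with_channel = add_channel_level(df, channel=1)
print(df_with_channel)
```

This does not cover the mixdown case, where several channels are combined into one and a single channel number no longer describes the result.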