audeering/audinterface

Speed up feature extraction

frankenjoe opened this issue · 6 comments

When extracting features with Feature we currently rely on Process under the hood, which returns a pd.Series with feature vectors. We then convert these to a list and afterwards call pd.concat(list) to combine them into a single matrix. The last step can take quite long (sometimes as long as, or longer than, the feature extraction itself). We could speed this up by pre-allocating a matrix beforehand and assigning the values directly. At least when not processing with a sliding window, this should be possible.

To demonstrate that there is quite some room for improvement:

import time

import numpy as np
import pandas as pd

import audb
import audinterface
import audiofile


db = audb.load(
    'emodb',
    version='1.3.0',
    format='wav',
    sampling_rate=16000,
    mixdown=True,
)
files = db.files

def process_func(x, sr):
    return [x.mean(), x.std()]

# slow

feature = audinterface.Feature(
    ['mean', 'std'],
    process_func=process_func,
)

t = time.time()
df = feature.process_files(files)
print(time.time() - t)

# fast

t = time.time()
data = np.empty(
    (len(files), 2),
    dtype=np.float32,
)

for idx, file in enumerate(files):
    signal, sampling_rate = audiofile.read(file)
    data[idx, :] = process_func(
        signal,
        sampling_rate,
    )

df_fast = pd.DataFrame(
    data,
    index=df.index,
    columns=df.columns,
)
print(time.time() - t)

pd.testing.assert_frame_equal(df, df_fast)

Output:

5.972992181777954
0.17418813705444336

We then convert these to a list

I guess the idea for a solution is to avoid this step?

Yes, especially the concatenation of the DataFrames seems awfully slow. So the idea would be to create a matrix of the expected size (samples x features) and directly assign the extracted features. This is of course only possible if no sliding window is selected, as otherwise we cannot know the shape of the final matrix in advance. A sketch of the idea is shown below.
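A minimal sketch of how such pre-allocation could look (the helper name and its signature are hypothetical, not the actual audinterface internals):

import numpy as np
import pandas as pd

import audiofile


def process_files_preallocated(files, process_func, column_names):
    # Hypothetical helper: pre-allocate the (samples x features) matrix
    # instead of concatenating one small object per file.
    data = np.empty((len(files), len(column_names)), dtype=np.float32)
    for idx, file in enumerate(files):
        signal, sampling_rate = audiofile.read(file)
        # Without a sliding window each file yields exactly one feature
        # vector, so its row position is known in advance.
        data[idx, :] = process_func(signal, sampling_rate)
    # Build the final DataFrame once at the end.
    return pd.DataFrame(
        data,
        index=pd.Index(files, name='file'),
        columns=column_names,
    )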

After #102, #103, and #104 the above test now returns the following for me:

0.23550820350646973
0.17041683197021484

Can we close here, or is there further room for improvement?

I guess not; the comparison is also not 100% fair, as in the second case we rely on the index created by Feature. What is still missing is a speed-up of Segment. So we either expand this issue or create a new one.

I created #106 to track Segment and will close here.