audeering/audinterface

Index Rounding? Error since audinterface 1.0.0

schruefer opened this issue · 14 comments

When running an interface on a MultiIndex, the timestamps are sometimes rounded.

So e.g. instead of the initial end value "0 days 00:00:01.877812", audinterface returns a dataframe with the end value "0 days 00:00:01.877812500".

This behavior occurs only since version 1.0.0, the previous version 0.10.2 works fine.


import audb
import os
import audinterface

media = [
    'wav/03a01Fa.wav',
    'wav/03a01Nc.wav',
    'wav/16b10Wb.wav',
    'wav/03a01Wa.wav'
]
db = audb.load(
    'emodb',
    version='1.3.0',
    media=media,
    verbose=False,
)

files = list(db.files)
folder = os.path.dirname(files[0])
df = db['emotion'].get(as_segmented=True, allow_nat=False)
print(df)

def features(signal, sampling_rate):
    return [signal.mean(), signal.std()]

interface = audinterface.Feature(
    ['mean', 'std'],
    process_func=features,
)
df = interface.process_index(df.index)
print(df)

Outputs (for audinterface==1.0.0 and 1.0.1):


                                                                                emotion  emotion.confidence
file                                            start  end                                                  
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Fa.wav 0 days 0 days 00:00:01.898250  happiness                0.90
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Nc.wav 0 days 0 days 00:00:01.611250    neutral                1.00
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Wa.wav 0 days 0 days 00:00:01.877812      anger                0.95
/data/audb/emodb/1.3.0/d3b62a9b/wav/16b10Wb.wav 0 days 0 days 00:00:02.522499      anger                1.00
                                                                                      mean       std
file                                            start  end                                          
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Fa.wav 0 days 0 days 00:00:01.898250    -0.000311  0.082317
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Nc.wav 0 days 0 days 00:00:01.611250    -0.000312  0.125304
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Wa.wav 0 days 0 days 00:00:01.877812500 -0.000296  0.127394
/data/audb/emodb/1.3.0/d3b62a9b/wav/16b10Wb.wav 0 days 0 days 00:00:02.522499999 -0.000464  0.095558

Outputs (for audinterface==0.10.2):

                                                                                 emotion  emotion.confidence
file                                            start  end                                                  
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Fa.wav 0 days 0 days 00:00:01.898250  happiness                0.90
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Nc.wav 0 days 0 days 00:00:01.611250    neutral                1.00
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Wa.wav 0 days 0 days 00:00:01.877812      anger                0.95
/data/audb/emodb/1.3.0/d3b62a9b/wav/16b10Wb.wav 0 days 0 days 00:00:02.522499      anger                1.00
                                                                                   mean       std
file                                            start  end                                       
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Fa.wav 0 days 0 days 00:00:01.898250 -0.000311  0.082317
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Nc.wav 0 days 0 days 00:00:01.611250 -0.000312  0.125304
/data/audb/emodb/1.3.0/d3b62a9b/wav/03a01Wa.wav 0 days 0 days 00:00:01.877812 -0.000296  0.127394
/data/audb/emodb/1.3.0/d3b62a9b/wav/16b10Wb.wav 0 days 0 days 00:00:02.522499 -0.000464  0.095558

Python 3.8 all packages:

audb 1.4.2
audbackend 0.3.18
audeer 1.19.0
audfactory 1.0.12
audformat 0.16.1
audinterface 1.0.1
audiofile 1.2.1
audmath 1.2.1
audobject 0.7.9
audresample 1.2.1
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 3.1.0
dohq-artifactory 0.8.4
filelock 3.10.7
idna 3.4
importlib-metadata 6.1.0
iso-639 0.4.5
iso3166 2.1.1
numpy 1.24.2
oyaml 1.0
pandas 2.0.0
pip 20.0.2
pkg-resources 0.0.0
pycparser 2.21
PyJWT 2.6.0
python-dateutil 2.8.2
pytz 2023.3
PyYAML 6.0
requests 2.28.2
setuptools 44.0.0
six 1.16.0
soundfile 0.12.1
tqdm 4.65.0
tzdata 2023.3
urllib3 1.26.15
zipp 3.15.0

Thanks for reporting, we will try to find out what's going on. As a temporary fix you can use preserve_index=True:

...
df = interface.process_index(df.index, preserve_index=True)
print(df)
file                                              start  end                                       
/media/jwagner/Data/audb/emodb/1.3.0/d3b62a9b/... 0 days 0 days 00:00:01.898250 -0.000311  0.082317
                                                         0 days 00:00:01.611250 -0.000312  0.125304
                                                         0 days 00:00:01.877812 -0.000296  0.127394
                                                         0 days 00:00:02.522499 -0.000464  0.095558

Ok, it's actually an interesting issue. The reason we see a difference between the versions is that before 1.0.0 we kept the end time from the index, whereas now we overwrite it with the duration we calculate from the number of samples that are processed. In theory these values should match, of course. Maybe it's because we use sloppy=True when we calculate the duration in audb, or it's a rounding issue when the duration is stored to CSV as part of the dependency table. In any case, the behavior is not nice and we should make sure that we keep the end value from the index.
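To make the two candidate end values concrete, here is a minimal sketch in plain Python (no audinterface or pandas involved). It assumes a sampling rate of 16000 Hz and a length of 30045 samples for 03a01Wa.wav; both numbers are inferred from the printed durations rather than read from the file, and the helper names are made up for illustration.

```python
from fractions import Fraction

# Assumed values, inferred from the printed durations (not read from the file).
SAMPLING_RATE = 16000  # Hz
NUM_SAMPLES = 30045    # hypothetical length of 03a01Wa.wav


def duration_ns(num_samples: int, sampling_rate: int) -> int:
    """Exact duration in whole nanoseconds, using integer arithmetic only."""
    return int(Fraction(num_samples, sampling_rate) * 10**9)


def truncate_to_us(ns: int) -> int:
    """Drop everything below microsecond resolution."""
    return ns // 1000


end_from_samples = duration_ns(NUM_SAMPLES, SAMPLING_RATE)
end_in_us = truncate_to_us(end_from_samples)

print(end_from_samples)  # 1877812500 ns, displayed as 00:00:01.877812500
print(end_in_us)         # 1877812 us, displayed as 00:00:01.877812
```

If the stored index only keeps microseconds, an end value recomputed from the sample count would carry extra digits, which would match the 03a01Wa.wav row. Whether that is what actually happens in audb is not confirmed here.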

Or maybe not :)

One advantage of the current implementation is that it returns the correct time if end is out-of-bounds, e.g.:

file = '/media/jwagner/Data/audb/emodb/1.3.0/d3b62a9b/wav/16b10Wb.wav'
interface.process_file(file, end='999999s')

With pre 1.0.0 it returns:

                                                                               mean       std
file                                              start  end                                 
/media/jwagner/Data/audb/emodb/1.3.0/d3b62a9b/... 0 days 11 days 13:46:39 -0.000464  0.095558

But with 1.0.0:

                                                                                        mean       std
file                                              start  end                                          
/media/jwagner/Data/audb/emodb/1.3.0/d3b62a9b/... 0 days 0 days 00:00:02.522499999 -0.000464  0.095558

So I would argue we should keep the new behavior and encourage the user to use preserve_index=True if the index must not change.
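The out-of-bounds case can be sketched as a simple clamp. This only illustrates the behavior described above and is not audinterface's code; clamp_end is a hypothetical helper, and the file duration of 2.5225 s is an assumption based on the values printed for 16b10Wb.wav.

```python
from datetime import timedelta


def clamp_end(requested_end: timedelta, file_duration: timedelta) -> timedelta:
    """Return the end time that can actually be processed."""
    return min(requested_end, file_duration)


file_duration = timedelta(microseconds=2_522_500)  # assumed: 2.5225 s
requested_end = timedelta(seconds=999_999)         # corresponds to end='999999s'

print(requested_end)                            # 11 days, 13:46:39
print(clamp_end(requested_end, file_duration))  # 0:00:02.522500
```

A requested end inside the file is returned unchanged, so only out-of-bounds values are affected by the clamp.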

@hagenw opinion?

I also think that the current behavior makes sense.

But as an intermediate step we should try to find out at which place exactly we are getting rounding errors. Maybe there is a way to avoid those.

> But as an intermediate step we should try to find out at which place exactly we are getting rounding errors. Maybe there is a way to avoid those.

Can it be related to setting sloppy=True when we read the file duration in audb.publish()? Even if we work with WAV files?

soundfile.info(file).duration most likely reads the duration from the header. I don't know if there is a way you can create WAV files that have a duration in the header that does not match the number of samples. But different libraries might round 0.5 differently.

Ok, I think I have found the guilty one:

>>> import pandas as pd
>>> dur = 2.5225
>>> pd.to_timedelta(dur, 's').total_seconds()
2.522499

There is a workaround proposed in pandas-dev/pandas#46819

>>> pd.to_timedelta(dur, 's') / pd.Timedelta(seconds=1)
2.522499999

I guess to get exactly the original value back we need to round to less than nanosecond precision:

>>> round(pd.to_timedelta(dur, 's') / pd.Timedelta(seconds=1), ndigits=8)
2.5225

I don't think we can handle this already when doing the pd.to_timedelta(dur, unit='s') conversion, e.g.

>>> pd.to_timedelta(round(dur, ndigits=8), 's')
Timedelta('0 days 00:00:02.522499999')

Looks like we can only do it when converting back to seconds.
Or as an alternative we could check if there is a way to avoid converting to timedelta in the first place.

> So I would argue we should keep the new behavior and encourage the user to use preserve_index=True if the index must not change.

Would it be possible to set preserve_index=True by default?
I would assume that the majority of people using process_index would like to keep the index.

We cannot easily do that, since so far we always return a segmented index by default. But with preserve_index=True the result can be a filewise index (if the input is also a filewise index).

The following workaround seems to work:

>>> pd.to_timedelta(dur * 10 ** 9, 'ns')
Timedelta('0 days 00:00:02.522500')
>>> pd.to_timedelta(dur * 10 ** 9, 'ns').total_seconds()
2.5225
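One possible explanation for why the 'ns' variant behaves better, sketched in plain Python. The split into whole seconds and fraction is an assumption about how a seconds-to-nanoseconds conversion might lose a nanosecond; it is not confirmed to be what pandas does, and both function names are hypothetical.

```python
def to_ns_split(seconds: float) -> int:
    """Convert whole seconds and fraction separately, truncating the fraction."""
    whole = int(seconds)
    frac = seconds - whole
    return whole * 10**9 + int(frac * 1e9)


def to_ns_scaled(seconds: float) -> int:
    """Scale the full value to nanoseconds first, then round to an integer."""
    return round(seconds * 1e9)


dur = 2.5225
print(to_ns_split(dur))   # 2522499999
print(to_ns_scaled(dur))  # 2522500000

# Rounding to fewer digits than nanosecond precision when going back
# to seconds also recovers the original value.
print(round(to_ns_split(dur) / 10**9, 8))  # 2.5225
```

The fraction 0.5225 has no exact binary representation, so multiplying it by 1e9 lands just below 522500000 and truncation drops one nanosecond. Scaling the full value first ends up on the exact integer, which is consistent with the outputs shown above.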