Question about length of mfcc output array

pkolb opened this issue 7 years ago · 2 comments

I'm a little confused about the length of the mfcc output array. The following code

from python_speech_features import mfcc
import scipy.io.wavfile as wav
(rate,sig) = wav.read('test.wav')
mfcc_feat = mfcc(sig,rate)
print("rate="+str(rate))
print("sig.size="+str(sig.size))
print("mfcc_feat.shape="+str(mfcc_feat.shape))

produces:

rate=16000
sig.size=1760
mfcc_feat.shape=(10, 13)

I was expecting a shape of (11, 13), since the audio length is 110 ms (160 samples per 10 ms), which should result in 11 steps of 10 ms each. Is that not the case?
(If I append more samples, I get 11 steps starting at 1841 samples, while sig.size=1840 still gives 10 steps.)

Answer 1 · 2018-06-02T08:17:08.000Z

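The frame length is 25ms by default (winlen=0.025), so a signal of up to 25ms will be 1 frame. Up to 35ms will be 2 frames (because the shift is 10ms, winstep=0.01). Up to 45ms will be 3 frames etc. Up to 115ms being 10 frames, which covers your 110ms signal. The last frame is zero-padded to full length, which is why 1840 samples (115ms) still give 10 frames and the 11th only appears from 1841 samples. If you draw out the frames and the overlaps on some paper it should make sense. Hope this helps!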
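
The frame arithmetic can be checked with a small sketch. It assumes the library defaults of winlen=0.025 and winstep=0.01 and that the last partial frame is zero-padded (rounded up); `num_frames` is a hypothetical helper written for illustration, not a function of python_speech_features.

```python
import math

def num_frames(num_samples, rate=16000, winlen=0.025, winstep=0.01):
    """Hypothetical helper: count analysis frames for a signal.

    Assumes the final partial frame is zero-padded, so the count is rounded up.
    """
    frame_len = int(round(winlen * rate))    # 400 samples at 16 kHz
    frame_step = int(round(winstep * rate))  # 160 samples at 16 kHz
    if num_samples <= frame_len:
        return 1
    return 1 + math.ceil((num_samples - frame_len) / frame_step)

# Sample counts mentioned in the question
for n in (1760, 1840, 1841):
    print(n, num_frames(n))
```

Under these assumptions the sketch reproduces the counts reported in the question: 10 frames for 1760 and 1840 samples, and 11 frames from 1841 samples.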

Answer 2 · 2018-06-02T11:46:57.000Z

Yes, thanks for the explanation!