jameslyons/python_speech_features

Question about length of mfcc output array

pkolb opened this issue · 2 comments

pkolb commented

I'm a little confused about the length of the mfcc output array. The following code

from python_speech_features import mfcc
import scipy.io.wavfile as wav
(rate,sig) = wav.read('test.wav')
mfcc_feat = mfcc(sig,rate)
print("rate="+str(rate))
print("sig.size="+str(sig.size))
print("mfcc_feat.shape="+str(mfcc_feat.shape))

produces:

rate=16000
sig.size=1760
mfcc_feat.shape=(10, 13)

I was expecting a shape of (11, 13): the audio is 110 ms long (1760 samples at 160 samples per 10 ms), which should give 11 steps of 10 ms each, shouldn't it?
(If I append more samples, I get 11 steps starting from sig.size=1841, while sig.size=1840 still gives 10 steps.)
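
For reference, the numbers above are consistent with framing that uses a 25 ms window and a 10 ms step (the defaults of `mfcc`, `winlen=0.025` and `winstep=0.01`). The sketch below is a minimal illustration, not the library's code: the helper name `expected_num_frames` is hypothetical, and the formula (one full window, plus one frame per further step, rounded up so a trailing partial frame is still counted) is an assumption that happens to reproduce the three observations reported here.

```python
import math


def expected_num_frames(num_samples, rate, winlen=0.025, winstep=0.01):
    """Estimate the number of frames for a signal (hypothetical helper).

    Assumes the window and step lengths are converted to whole samples
    and that a trailing partial frame is counted (rounded up).
    """
    frame_len = int(round(winlen * rate))    # 400 samples at 16 kHz
    frame_step = int(round(winstep * rate))  # 160 samples at 16 kHz
    if num_samples <= frame_len:
        # Signal shorter than one window: a single frame.
        return 1
    # One full window, then one frame for every (possibly partial) step after it.
    return 1 + math.ceil((num_samples - frame_len) / frame_step)


for n in (1760, 1840, 1841):
    print(n, expected_num_frames(n, 16000))
# 1760 10
# 1840 10
# 1841 11
```

Under this assumption the frame count depends on how many steps fit after the first 25 ms window, not on the total duration divided by the step, which is why 110 ms of audio gives 10 rows rather than 11.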

pkolb commented

Yes, thanks for the explanation!