amanbasu/speech-emotion-recognition

zero_crosses should be '/320' not '/0.02', shouldn't it?

Closed · 1 comment

In extract_features.py, shouldn't the following line divide by 320 rather than 0.02?

zero_crosses = np.nonzero(np.diff(sig[start:end] > 0))[0].shape[0]/0.02 # zero crosses

↓ suggested change

zero_crosses = np.nonzero(np.diff(sig[start:end] > 0))[0].shape[0] / 320 # zero crosses

※ Zero Crossing Rate : The rate of sign-changes of the signal during the duration of a particular frame. [1]

ref.[1]: https://github.com/tyiannak/pyAudioAnalysis/wiki/3.-Feature-Extraction

These features are normalized before being fed to the network, so dividing by a constant has essentially no effect: the normalization removes the scale factor.
Using 0.02 or 320 therefore makes no difference.
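
A quick way to check this claim is to compute both variants on the same frames and compare them after normalization. The sketch below is only illustrative and rests on several assumptions that are not in the repository: z-score normalization (the thread says only "normalized"), a synthetic noise signal, non-overlapping frames, and the helper names `zero_cross_count` and `zscore`. If 320 samples correspond to 0.02 s, the sampling rate would be 16 kHz, so `/0.02` would give crossings per second and `/320` crossings per sample.

```python
import numpy as np

def zero_cross_count(frame):
    # Same counting expression as in the issue: number of sign changes in the frame.
    return np.nonzero(np.diff(frame > 0))[0].shape[0]

def zscore(x):
    # Assumed normalization (hypothetical; the repository may normalize differently).
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

rng = np.random.default_rng(0)
sig = rng.standard_normal(16000)   # synthetic stand-in for one second of audio
frame_len = 320                    # frame length in samples, taken from the issue

# Count zero crossings in each non-overlapping frame.
counts = np.array([
    zero_cross_count(sig[i:i + frame_len])
    for i in range(0, len(sig) - frame_len + 1, frame_len)
])

per_second = counts / 0.02   # original code: divide by frame duration
per_sample = counts / 320    # proposed change: divide by frame length

# After z-scoring, the two features coincide up to floating-point error.
same = np.allclose(zscore(per_second), zscore(per_sample))
print(same)
```

The two variants differ only by a constant positive factor, which any normalization that rescales by the feature's own spread (z-score, min-max) cancels. The choice would matter only if the raw, unnormalized value were used or reported as a rate.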