adamstark/Gist

MFCC

JunGenius opened this issue · 2 comments

Hello, author.

I would like to ask you a few questions.

1. Is it possible to get a single MFCC for a whole piece of audio treated as one frame? For example, if I input 10 s of audio and 100 s of audio respectively, would the result in each case be one 13-dimensional MFCC?

2. When I get the MFCC feature, the first value is large. Is one of the parameters I passed incorrect? (I use KISS_FFT.)
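```cpp
#include <iostream>
#include <vector>
#include "Gist.h"

// MY_MFCC is presumably the DLL export macro, defined elsewhere in the project.
extern "C" MY_MFCC void _stdcall GETMFCC(float *audioFrame, int frameSize, int sampleRate) {
  // The Gist instance is sized to the whole input, so the input is one frame.
  Gist<float> gist(frameSize, sampleRate);
  gist.processAudioFrame(audioFrame, frameSize);
  const std::vector<float>& mfcc = gist.getMelFrequencyCepstralCoefficients();
  std::cout << "size: " << mfcc.size() << std::endl;
}
```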
Thank you very much. I look forward to your answer.

@JunGenius I faced the same problem. The Mel frequency spectrum is 13-dimensional, but all of its values are zero.

Hi guys, if you are getting zeros in the MFCC, I think the most likely cause is that you aren't setting up the FFT correctly and no FFT is being calculated. Can you check the flags / paths etc to make sure they are correct?
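A minimal sketch of how that check might look. It assumes the FFT backend is selected with a compile-time define, `USE_FFTW` or `USE_KISS_FFT`, which is my understanding of how Gist is configured. The helper names `fftBackend` and `isAllZero` are hypothetical and not part of Gist.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: reports which FFT backend macro (if any) is visible
// to this translation unit. "none" suggests the compiler flag is missing.
std::string fftBackend()
{
#if defined(USE_FFTW)
    return "USE_FFTW";
#elif defined(USE_KISS_FFT)
    return "USE_KISS_FFT";
#else
    return "none";
#endif
}

// Hypothetical helper: true if every value is exactly zero, which is the
// symptom described above when it is applied to a spectrum or MFCC vector.
bool isAllZero(const std::vector<float>& values)
{
    return std::all_of(values.begin(), values.end(),
                       [](float v) { return v == 0.0f; });
}
```

If `fftBackend()` returns `"none"` in the file that builds Gist, or `isAllZero` is true for the magnitude spectrum of a non-silent frame, the FFT setup is the first thing to look at.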

On @JunGenius's question about computing an MFCC from a single frame of audio: the MFCC class requires the magnitude spectrum of the FFT as its input. That spectrum could be calculated from a very short audio frame or a very long section of audio, as long as the result is a single magnitude spectrum.
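To illustrate the point, here is a self-contained sketch that is not Gist's implementation. It uses a naive DFT in place of the FFT and a simplified 13-band mel filterbank followed by a DCT. The function names and the filterbank layout are assumptions made for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

const double kPi = 3.14159265358979323846;

// Naive O(N^2) DFT magnitude spectrum; a stand-in for the FFT backend.
std::vector<float> magnitudeSpectrum(const std::vector<float>& frame)
{
    const std::size_t n = frame.size();
    std::vector<float> mag(n / 2);
    for (std::size_t k = 0; k < n / 2; ++k) {
        double re = 0.0, im = 0.0;
        for (std::size_t t = 0; t < n; ++t) {
            const double phase = 2.0 * kPi * k * t / n;
            re += frame[t] * std::cos(phase);
            im -= frame[t] * std::sin(phase);
        }
        mag[k] = static_cast<float>(std::sqrt(re * re + im * im));
    }
    return mag;
}

double hzToMel(double hz) { return 2595.0 * std::log10(1.0 + hz / 700.0); }
double melToHz(double mel) { return 700.0 * (std::pow(10.0, mel / 2595.0) - 1.0); }

// Simplified MFCC: triangular mel filters on the power spectrum, log, DCT-II.
std::vector<float> mfccFromMagnitude(const std::vector<float>& mag,
                                     int sampleRate, int numCoeffs = 13)
{
    const int numBins = static_cast<int>(mag.size());
    const double nyquist = sampleRate / 2.0;
    const double melMax = hzToMel(nyquist);
    std::vector<double> logEnergy(numCoeffs);
    for (int m = 0; m < numCoeffs; ++m) {
        // Filter edges and centre, evenly spaced on the mel scale.
        const double lo  = melToHz(melMax * m / (numCoeffs + 1));
        const double mid = melToHz(melMax * (m + 1) / (numCoeffs + 1));
        const double hi  = melToHz(melMax * (m + 2) / (numCoeffs + 1));
        double energy = 0.0;
        for (int k = 0; k < numBins; ++k) {
            const double hz = k * nyquist / numBins;
            double w = 0.0;
            if (hz > lo && hz <= mid)     w = (hz - lo) / (mid - lo);
            else if (hz > mid && hz < hi) w = (hi - hz) / (hi - mid);
            energy += w * mag[k] * mag[k];
        }
        logEnergy[m] = std::log(energy + 1e-12);  // small offset avoids log(0)
    }
    std::vector<float> mfcc(numCoeffs);
    for (int i = 0; i < numCoeffs; ++i) {
        double sum = 0.0;
        for (int m = 0; m < numCoeffs; ++m)
            sum += logEnergy[m] * std::cos(kPi * i * (m + 0.5) / numCoeffs);
        mfcc[i] = static_cast<float>(sum);
    }
    return mfcc;
}
```

Passing a 64-sample frame or a 512-sample frame through `magnitudeSpectrum` and then `mfccFromMagnitude` yields 13 coefficients in both cases. Only the frequency resolution of the spectrum changes. In this sketch the first coefficient is the plain sum of the log filterbank energies, so it is usually much larger in magnitude than the others. Whether that explains the large first value reported above is an assumption, not something confirmed in this thread.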

I hope that makes sense,

Adam