10ms framesize limit and noise in denoised audio stream
demeng opened this issue · 3 comments
We have a question about the 10 ms frame size limit. We need to process the audio stream in smaller frames. Does the limitation come from WebRTC audio processing or from the Python wrapper? Is there a way to work around it?
In addition, when we run noise suppression on short buffers (e.g. 50 ms) of a long audio stream, the denoised audio has beat-like noise at the buffer boundaries. What causes this noise?
It's a requirement of webrtc audio processing.
Not sure about the second question. Does the audio become a discontinuous signal?
Xiongyi, thanks for the response. Here are two examples of audio processed with different buffer sizes (we chopped the original audio into small buffers, processed each buffer in order, then merged them all back together).
This is the audio processed with a 50 ms buffer.
For comparison, this is the audio processed with a 2 s buffer.
The audio processed with 50 ms buffers becomes very discontinuous, and artifacts seem to be introduced at the buffer boundaries. In fact, in the audio processed with 2 s buffers, a (mild) discontinuity can also be heard every 2 seconds. It is much more noticeable with smaller buffers because it occurs much more frequently.
Is there any way to solve this so that the noise-suppressed audio sounds the same regardless of buffer size? Any insight is appreciated!
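One thing that may be worth checking (an assumption, not confirmed in this thread): a noise suppressor typically keeps internal state from frame to frame. Boundary artifacts would therefore be expected if a new processor is created for each buffer, or if a buffer's length is not a whole number of 10 ms frames and the tail is padded or dropped. Below is a minimal sketch of a rebuffering layer that accepts buffers of any size, emits only complete 10 ms frames, and carries the remainder over to the next call. `FrameRebuffer`, `frame_bytes`, and the `process_frame` callable are hypothetical names for illustration. `process_frame` stands in for whatever call hands one 10 ms frame to a single long-lived processor instance, and 16-bit PCM is assumed.

```python
from typing import Callable, List

SAMPLE_WIDTH = 2  # bytes per sample; assumes 16-bit PCM


def frame_bytes(sample_rate: int, channels: int = 1, frame_ms: int = 10) -> int:
    """Size in bytes of one frame of 16-bit PCM audio."""
    return sample_rate * frame_ms // 1000 * channels * SAMPLE_WIDTH


class FrameRebuffer:
    """Feed arbitrary-sized buffers to a fixed-frame processor.

    Only complete frames are passed on; leftover bytes are kept and
    prepended to the next buffer, so frames never straddle a gap.
    """

    def __init__(self, process_frame: Callable[[bytes], bytes], frame_size: int):
        # Hypothetical callable: should wrap ONE processor instance that
        # is reused for the whole stream, so its internal state persists.
        self._process_frame = process_frame
        self._frame_size = frame_size
        self._pending = b""

    def feed(self, buf: bytes) -> bytes:
        """Process as many complete frames as available; return their output."""
        self._pending += buf
        out: List[bytes] = []
        while len(self._pending) >= self._frame_size:
            frame = self._pending[: self._frame_size]
            self._pending = self._pending[self._frame_size:]
            out.append(self._process_frame(frame))
        return b"".join(out)

    @property
    def pending_bytes(self) -> int:
        """Number of bytes waiting for the next complete frame."""
        return len(self._pending)


if __name__ == "__main__":
    size = frame_bytes(16000)  # 10 ms of 16 kHz mono 16-bit audio
    # Identity function used as a stand-in for the real processing call.
    rebuffer = FrameRebuffer(lambda frame: frame, size)
    for chunk_len in (100, 500, 360):
        produced = rebuffer.feed(b"\x00" * chunk_len)
        print(chunk_len, len(produced), rebuffer.pending_bytes)
```

With this arrangement the processor sees the same sequence of 10 ms frames no matter how the input was chopped, so the output should not depend on the buffer size, provided the assumption about persistent state holds for the processor in use.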