Sampling Rates (SR) above 16 kHz

Question

cweaver-logitech opened this issue 2 years ago · 3 comments

Thanks for this nice package. I'm curious whether any models have been trained with a sampling rate above 16 kHz?

Exception has occurred: ParameterError
Mono data must have shape (samples,). Received shape=(1, 320000)

Answer 1 · 2022-12-06T05:39:02.000Z

Hi @cweaver-logitech, thanks for using mayavoz.
All the currently available pretrained models are trained at 16 kHz. This is due to two reasons:

16 kHz is the recommended SR in the corresponding model architecture paper.
Training at a higher SR requires more GPU resources than I have.

However, this is not a constraint on inference: the input can be at any sampling rate, and mayavoz will automatically resample it to the sampling rate the model requires.
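To make the resampling step concrete, here is a minimal sketch of converting a mono signal from one sampling rate to another. It is not mayavoz's implementation: the function name `resample_linear` is hypothetical, and it uses naive linear interpolation, whereas a production resampler would also low-pass filter before downsampling to avoid aliasing.

```python
def resample_linear(samples, sr_in, sr_out):
    """Naively resample a mono signal (sequence of floats) from sr_in to sr_out.

    Illustrative only: linear interpolation, no anti-aliasing filter.
    """
    if sr_in <= 0 or sr_out <= 0:
        raise ValueError("sampling rates must be positive")
    if sr_in == sr_out or len(samples) == 0:
        return list(samples)

    # Number of output samples covering the same duration as the input.
    n_out = max(1, round(len(samples) * sr_out / sr_in))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * sr_in / sr_out      # position in input-sample units
        lo = int(pos)
        if lo >= last:
            out.append(samples[-1])   # hold the final sample past the end
        else:
            frac = pos - lo
            out.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    return out
```

For example, one second of 48 kHz audio passed as `resample_linear(x, 48000, 16000)` comes back with 16000 samples, which is the length a 16 kHz model would expect for the same duration.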

Can you share some more details about the error?

Answer 2 · 2022-12-06T10:34:55.000Z

Thanks for the quick reply. No need to look any further, as what I want is for the input sample rate (44.1 or 48 kHz) to be preserved when the inference audio is written.

Answer 3 · 2022-12-06T11:26:54.000Z

Thanks for the clarification, @cweaver-logitech. I see your point. Currently, when writing the output to a file, mayavoz uses the model sampling rate, but if you choose to have the output returned instead, mayavoz returns it at the input sampling rate.
I have opened a separate issue for this: #34
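Until that is addressed, one possible workaround is to take the returned output (which is at the input sampling rate) and write the file yourself, so the header rate matches the samples. The sketch below uses only the Python standard library and does not call mayavoz. The helper names `write_wav_mono`, `wav_info`, and `demo_roundtrip` are hypothetical, and it assumes the audio is a mono sequence of floats in [-1, 1].

```python
import math
import os
import struct
import tempfile
import wave


def write_wav_mono(path, samples, sample_rate):
    """Write floats in [-1, 1] as a 16-bit PCM mono WAV at sample_rate."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)              # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)    # must match the rate of `samples`
        wf.writeframes(frames)


def wav_info(path):
    """Return (sample_rate, n_frames) read back from the WAV header."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getnframes()


def demo_roundtrip(sample_rate, n_samples=441):
    """Write a short 440 Hz tone to a temp file and read its header back."""
    tone = [
        0.5 * math.sin(2 * math.pi * 440 * i / sample_rate)
        for i in range(n_samples)
    ]
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        write_wav_mono(path, tone, sample_rate)
        return wav_info(path)
    finally:
        os.remove(path)
```

The key point is that the rate passed to `setframerate` has to be the rate the samples are actually at; writing 44.1 kHz samples with a 16 kHz header would play back slowed down and pitched lower.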