w-okada/voice-changer

[REQUEST]: Can you please add rmvpe again?

4i0i062 opened this issue · 6 comments

In a few words, describe your idea

Can you please add rmvpe again?

More information

Can you please add rmvpe again?

rmvpe_onnx doesn't work properly. Could you please add rmvpe back in the new version?

I tried it in 2.0.36, 2.0.40, and 2.0.58, but I'm still using version 1.5.3.18a because rmvpe_onnx doesn't work properly.

Sincerely, please

In your case, what's the difference that you experience?

The biggest differences between rmvpe and rmvpe_onnx are accuracy and breaking.
rmvpe is much more stable on both counts.
rmvpe_onnx is not good enough when it comes to transformation.

Some of the old versions that you mentioned were broken with ONNX files, so consider the type of model too. Did you try with a PTH voice model?

In my case, I also feel like rmvpe works slightly better than rmvpe_onnx, but I am not sure whether the difference is in the F0 detection method or the software itself. I also keep using v1 over v2 for now for that reason, and because v1 barely uses any CPU while v2 uses double the CPU of v1 (though this has gotten much better lately). In 2.0.58, both performance and conversion quality are very close to v1, so I think we are close to achieving the same quality as v1 but with improvements (hopefully, otherwise v2 wouldn't make sense lol). V2 is still in development after all.
I would like to have rmvpe back too and see if the quality is the same as in v1. I also wonder if the quality of an F0 detection method varies between models. Maybe some models work better with one method than another. Or maybe it also depends on the person's voice and language.

In the latest version v.2.0.60-alpha, we have added support for Torch's RMVPE.
Please give it a try.

Thank you for your continued cooperation.


Thank you so much!

I need some free time to test, but I would like to know @4i0i062's opinion about it. Does this one work for you as well as in v1?

Does this one work for you as well as in v1?

I think so. I did not change the way it is processed.