modelscope/3D-Speaker

Methods for fine-tuning of pretrained models in modelscope

Hello, thank you for the wonderful repository! It really helped.
Our team is currently trying to fine-tune the ERes2Net-200k model published in modelscope on a large amount of speech data. I have not been able to fine-tune it properly, so I suspect that several parameters in the configuration need to be modified for this task. Could you please share those details? If the fine-tuning succeeds with good results, I will share the methodology with the community.

Thank you for acknowledging our research. In general, we suggest first freezing the ERes2Net model parameters and training only the classifier on your data until convergence. After that, you can jointly train the encoder and the classifier with a lower learning rate. We encourage you to experiment with this approach and welcome further research and discussion on the topic.
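The two-stage recipe above could be sketched in plain PyTorch roughly as follows. This is only an illustration under stated assumptions, not the 3D-Speaker training code: the tiny `encoder` stands in for the pretrained ERes2Net, and the function names, feature dimensions, learning rates, and random data are all hypothetical placeholders to be replaced with your own model, data loader, and tuned values.

```python
import torch
from torch import nn


def fake_batches(n_batches, feat_dim=80, n_speakers=10, batch_size=8):
    """Yield random (features, labels) pairs; a placeholder for a real data loader."""
    for _ in range(n_batches):
        yield torch.randn(batch_size, feat_dim), torch.randint(0, n_speakers, (batch_size,))


def train_classifier_only(encoder, classifier, batches, lr=0.1):
    """Stage 1: freeze the encoder and update only the classifier."""
    for p in encoder.parameters():
        p.requires_grad = False
    encoder.eval()  # keep BatchNorm/Dropout layers in inference mode while frozen
    optimizer = torch.optim.SGD(classifier.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for feats, labels in batches:
        with torch.no_grad():  # no gradients are needed for the frozen encoder
            emb = encoder(feats)
        loss = criterion(classifier(emb), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


def train_jointly(encoder, classifier, batches, encoder_lr=1e-4, classifier_lr=1e-3):
    """Stage 2: unfreeze the encoder and train both parts with lower learning rates."""
    for p in encoder.parameters():
        p.requires_grad = True
    encoder.train()
    # Per-parameter-group learning rates; the values here are assumptions.
    optimizer = torch.optim.SGD(
        [
            {"params": encoder.parameters(), "lr": encoder_lr},
            {"params": classifier.parameters(), "lr": classifier_lr},
        ],
        momentum=0.9,
    )
    criterion = nn.CrossEntropyLoss()
    for feats, labels in batches:
        loss = criterion(classifier(encoder(feats)), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    # Hypothetical stand-in for the pretrained ERes2Net encoder (192-dim embeddings assumed).
    encoder = nn.Sequential(nn.Linear(80, 192), nn.ReLU(), nn.Linear(192, 192))
    classifier = nn.Linear(192, 10)  # 10 = assumed number of speakers in your data
    train_classifier_only(encoder, classifier, fake_batches(5))
    train_jointly(encoder, classifier, fake_batches(5))
```

In practice the first stage would run until the classifier's validation loss stops improving, and the second-stage learning rates would be chosen well below the rate used for training from scratch so that the pretrained weights are not overwritten too quickly.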