I have a question about using LoRA for fine-tuning
h0ngc opened this issue · 4 comments
I have trained a VITS model, but when I apply LoRA to the attention layers, fine-tuning does not work properly. Could you please tell me which layers you applied LoRA to when fine-tuning the VITS model, and what values you used for rank and alpha?
There is no VITS here, just a BigVGAN; after the upsample layers, the speaker info is used to modulate x with weights and biases.
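(Not part of the original thread, just a sketch for readers: that kind of speaker-conditioned "weights and biases" modulation can look roughly like the FiLM-style module below. `SpeakerAdapter`, `speaker_dim`, and `channels` are illustrative names, not this repo's actual API.)

```python
import torch
import torch.nn as nn

class SpeakerAdapter(nn.Module):
    """Predict a per-channel scale and bias from a speaker embedding (FiLM-style)."""
    def __init__(self, speaker_dim: int, channels: int):
        super().__init__()
        self.to_scale = nn.Linear(speaker_dim, channels)
        self.to_bias = nn.Linear(speaker_dim, channels)
        # Initialize as an identity transform so the pretrained vocoder is unchanged at first.
        nn.init.zeros_(self.to_scale.weight)
        nn.init.ones_(self.to_scale.bias)
        nn.init.zeros_(self.to_bias.weight)
        nn.init.zeros_(self.to_bias.bias)

    def forward(self, x: torch.Tensor, spk: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time), spk: (batch, speaker_dim)
        scale = self.to_scale(spk).unsqueeze(-1)  # (batch, channels, 1)
        bias = self.to_bias(spk).unsqueeze(-1)
        return x * scale + bias

adapter = SpeakerAdapter(speaker_dim=256, channels=128)
x = torch.randn(2, 128, 100)      # feature map after an upsample layer
spk = torch.randn(2, 256)         # speaker embedding
y = adapter(x, spk)               # same shape as x, now conditioned on the speaker
```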
Thanks for your reply. Can I ask one more thing?
While checking your repo, I noticed that you set conv_post, activation, and speaker_adaptor to be trainable.
As I understand it, LoRA attaches low-rank linear layers to adapt existing weights, but your repo seems to fine-tune part of the model directly.
Is it some other adaptation of LoRA?
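(For comparison, a minimal sketch of what "standard" LoRA on a linear layer looks like; the rank and alpha values below are placeholders, not values taken from this repo.)

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad = False
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at start
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

layer = LoRALinear(nn.Linear(512, 512), rank=8, alpha=16.0)
out = layer(torch.randn(4, 512))  # only lora_a / lora_b receive gradients
```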
LoRA is a low-rank adapter; this is a different kind of adapter, in the spirit of Microsoft's AdaSpeech: Adaptive Text to Speech for Custom Voice, or Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers.
lora_svc is not real LoRA; the name is just meant to get SVC developers thinking about LoRA.
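(Sketch of the "freeze everything, unfreeze a few modules" recipe described above. `generator`, `speaker_adaptor`, `conv_post`, and `activation` are illustrative names taken from the comments, so check the repo's actual module names before reusing this.)

```python
import torch

def mark_adapter_trainable(generator: torch.nn.Module,
                           trainable_keys=("speaker_adaptor", "conv_post", "activation")):
    """Freeze the whole generator except parameters whose names contain one of trainable_keys."""
    for name, param in generator.named_parameters():
        param.requires_grad = any(key in name for key in trainable_keys)
    return [p for p in generator.parameters() if p.requires_grad]

# Usage (assuming a pretrained `generator` has already been built and loaded):
# optimizer = torch.optim.AdamW(mark_adapter_trainable(generator), lr=1e-4)
```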