On LoRA Hyperparameters
DaehanKim opened this issue · 2 comments
Hi, @yxuansu
Thank you for sharing this work!
I'm trying to apply this method to a Korean model.
https://github.com/yxuansu/PandaGPT/blob/main/code/config/openllama_peft.yaml#L11C1-L14
From what I have seen in other repos, alpha is a coefficient that the LoRA weights are multiplied by before they are added to the original weights. In the config file, alpha is set to 32, which I think is quite strong. I wonder whether you tried setting alpha=1.
Also, you set lora_rank = 32, so I am curious whether there is any ablation on this parameter as well. Thanks!
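To make sure I understand the role of alpha, here is a minimal sketch of how I read the scaling, assuming the HuggingFace peft LoRA implementation; the r and lora_alpha values mirror the linked yaml config, while the dropout and target_modules values are hypothetical placeholders, not taken from the repo:

```python
# Minimal sketch, assuming the HuggingFace `peft` LoRA implementation.
# r and lora_alpha mirror code/config/openllama_peft.yaml; the dropout and
# target_modules values are hypothetical placeholders for illustration.
from peft import LoraConfig, get_peft_model, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=32,               # lora_rank in the yaml config
    lora_alpha=32,      # alpha in the yaml config
    lora_dropout=0.05,  # hypothetical
    target_modules=["q_proj", "v_proj"],  # hypothetical
)

# In the standard LoRA formulation the adapted weight is
#   W' = W + (lora_alpha / r) * B @ A
# so alpha is the numerator of the scale applied to the low-rank update
# before it is added to the original weight.

# Usage (model_name is a placeholder):
# from transformers import AutoModelForCausalLM
# base_model = AutoModelForCausalLM.from_pretrained(model_name)
# peft_model = get_peft_model(base_model, lora_config)
```

If the implementation follows the standard LoRA scaling, r = 32 with alpha = 32 gives an effective scale of 1, while alpha = 1 at the same rank would scale the update by 1/32.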
Hello, thank you for your attention to our work @DaehanKim. The LoRA hyper-parameters are based on our experience from our OpenAlpaca project; we did not conduct a serious ablation study on them. They may also be suboptimal for the Vicuna model, so I think you can change these parameters and run some experiments of your own.
Thanks, I'll take a look. BTW, alpha=32 seems to work.