yxuansu/PandaGPT

On LoRA Hyperparameters

DaehanKim opened this issue · 2 comments

Hi, @yxuansu
Thank you for sharing this work!
I'm trying to apply this method to a Korean model.

https://github.com/yxuansu/PandaGPT/blob/main/code/config/openllama_peft.yaml#L11C1-L14

Looking at other repos, alpha is a coefficient that the LoRA weights are multiplied by before they are added to the original weights. In the config file, alpha is set to 32, which seems quite strong. I wonder whether you tried setting alpha=1.

Plus, you set lora_rank = 32, so I'm also curious whether there was any ablation on this parameter. Thanks!
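
For context, here is a minimal sketch of how I understand the two parameters interacting in a standard PEFT-style LoRA setup (the dropout value and target modules below are illustrative, not taken from your config):

```python
from peft import LoraConfig

# In the usual LoRA formulation the adapted weight is
#   W_eff = W + (lora_alpha / r) * (B @ A)
# so alpha=32 with r=32 gives a scaling factor of 1.0,
# while alpha=1 with r=32 would shrink the update to 1/32.
config = LoraConfig(
    r=32,             # corresponds to lora_rank in openllama_peft.yaml
    lora_alpha=32,    # numerator of the alpha / r scaling factor
    lora_dropout=0.1,                     # illustrative value
    target_modules=["q_proj", "v_proj"],  # illustrative target modules
)
```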

Hello, thank you for your attention to our work @DaehanKim. The LoRA hyper-parameters are based on our experience from this project: OpenAlpaca. We did not conduct a serious ablation study on them, so they may be suboptimal for the Vicuna model. I think you can change these parameters and run some experiments.
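
For example, one quick way to compare candidate settings before launching full fine-tuning runs is by their effective scaling factor alpha / r (a small illustrative sketch, not code from this repo):

```python
# Compare the effective LoRA scaling factor (alpha / r) for a few candidate settings.
candidates = [(32, 32), (32, 1), (16, 32), (8, 16)]  # (r, alpha), made-up examples
for r, alpha in candidates:
    print(f"r={r:2d}, alpha={alpha:2d} -> scaling factor {alpha / r:.3f}")
```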

Thanks, I'll take a look. By the way, alpha=32 seems to work.