yxuansu/PandaGPT

On LoRA Hyperparameters

DaehanKim opened this issue · 2 comments

Hi, @yxuansu
Thank you for sharing this work!
I'm trying to apply this method to a Korean model.

https://github.com/yxuansu/PandaGPT/blob/main/code/config/openllama_peft.yaml#L11C1-L14

Looking at other repos, alpha is a coefficient that the LoRA weights are multiplied by before they are added to the original weights. In the config file, alpha is set to 32, which seems quite strong. I wonder whether you tried setting alpha=1.

Plus, you set lora_rank = 32, so I'm also curious whether there was any ablation on this parameter. Thanks!
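
For context, here is a minimal sketch of how I understand the two parameters interacting in a standard PEFT-style LoRA setup (the dropout value and target modules below are illustrative, not taken from your config):

```python
from peft import LoraConfig

# In the usual LoRA formulation the adapted weight is
#   W_eff = W + (lora_alpha / r) * (B @ A)
# so alpha=32 with r=32 gives a scaling factor of 1.0,
# while alpha=1 with r=32 would shrink the update to 1/32.
config = LoraConfig(
    r=32,             # corresponds to lora_rank in openllama_peft.yaml
    lora_alpha=32,    # numerator of the alpha / r scaling factor
    lora_dropout=0.1,                     # illustrative value
    target_modules=["q_proj", "v_proj"],  # illustrative target modules
)
```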

Hello, thank you for your attention to our work @DaehanKim. The LoRA hyper-parameters are based on our experience from this project: OpenAlpaca. We did not conduct a serious ablation study on them, so they may be suboptimal for the Vicuna model. I think you can change these parameters and run some experiments.
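
For example, one quick way to compare candidate settings before launching full fine-tuning runs is by their effective scaling factor alpha / r (a small illustrative sketch, not code from this repo):

```python
# Compare the effective LoRA scaling factor (alpha / r) for a few candidate settings.
candidates = [(32, 32), (32, 1), (16, 32), (8, 16)]  # (r, alpha), made-up examples
for r, alpha in candidates:
    print(f"r={r:2d}, alpha={alpha:2d} -> scaling factor {alpha / r:.3f}")
```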

Thanks, I'll take a look. By the way, alpha=32 seems to work.