mlpc-ucsd/TokenCompose

Compatibility with LoRA

chmxu opened this issue · 2 comments

Hi, I wonder whether you have tried using LoRA to finetune the model. It should require less GPU memory than the current full-finetuning strategy.

Hi Chengming,

Yes, we attempted training with LoRA but encountered performance degradation at the default ranks, so we did not end up releasing a LoRA-finetuned version.

That said, it might still be possible to finetune with LoRA, although many hyperparameters (e.g., the rank and which attention weights to adapt) would likely need careful tuning for our setup, which makes the attention maps themselves an optimization objective.
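For reference, here is a minimal sketch of what attaching LoRA adapters to the UNet's attention projections could look like. This is not our training code; it assumes the `peft` and `diffusers` APIs, and the checkpoint name, rank, and `target_modules` are illustrative placeholders that would need to be tuned for the attention-map objective:

```python
# Sketch only: attach LoRA adapters to a Stable Diffusion UNet's attention
# projections with peft. Rank and target_modules are illustrative, not values
# we validated with the TokenCompose training objective.
from diffusers import UNet2DConditionModel
from peft import LoraConfig, get_peft_model

# Any SD checkpoint works here; the model id is just an example.
unet = UNet2DConditionModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="unet"
)

lora_config = LoraConfig(
    r=8,              # LoRA rank; likely needs to be larger than the default
    lora_alpha=8,
    lora_dropout=0.0,
    # diffusers' module names for the attention projection layers
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
)

unet = get_peft_model(unet, lora_config)
unet.print_trainable_parameters()  # only the LoRA weights require gradients
```

Training only these adapter weights is what reduces the memory footprint; whether the low-rank updates leave enough capacity for the token-level attention losses is exactly the open question above.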

Best,
Zirui

Thank you!