QA-LoRA Implementation and Review
okpatil4u opened this issue · 3 comments
Hello Eric,
I recently came across the QA-LoRA paper.
How easy or difficult would it be to implement in the candle-lora repo?
Hi @okpatil4u, thanks for the question. On the candle-lora side it should be easy, since the layer-swapping mechanism would handle the integration. Unfortunately, Candle does not yet provide good quantization support (to my knowledge; please let me know otherwise), so implementing QA-LoRA would be difficult for now.
Currently, I have not implemented QA-LoRA for candle-lora because there does not seem to be much support for quantized tensors on Candle's side. See my issue about that here: huggingface/candle#1006.
Regardless, thank you for your interest. Feel free to submit a PR if you find a way to implement QA-LoRA. I do plan on implementing it once Candle has better quantization support.
Yes, we have noticed the lack of support for quantized operations as well. Should I keep the issue open?
I think it would be better to close it. The question itself is resolved, and I will need to look into the quantization support. Thanks for the question!