hydrallm/llama-moe-v1

Weighted adapter inference script

Opened this issue · 3 comments

sumo43 commented

Using the following code:
https://github.com/kohya-ss/sd-scripts/blob/main/networks/svd_merge_lora.py#L114-L131
we can try an SVD merge of the LoRA adapters, based on aicrumb's method of using cosine distance from centroids to set the merge ratios. For this, we would also need to compute the centroids whenever we cluster a dataset (if we decide to go that route). A rough sketch of both pieces is below.
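Here is a minimal sketch of what that could look like, assuming the centroids come from clustering prompt embeddings. All names are hypothetical; whether the ratios should come from normalized similarities or something like a softmax is an assumption, and the kohya script additionally handles conv layers, per-module dims, and clamping:

import numpy as np

def centroid_merge_ratios(embedding, centroids):
    # Cosine similarity between the input embedding and each cluster centroid.
    emb = embedding / np.linalg.norm(embedding)
    cents = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    sims = cents @ emb
    # Drop negative similarities, then normalize so the ratios sum to 1.
    sims = np.clip(sims, 0.0, None)
    return sims / sims.sum()

def svd_merge(deltas, ratios, rank):
    # Weighted sum of each adapter's delta-W (lora_B @ lora_A), roughly
    # following the kohya script, then a truncated SVD to re-factorize
    # the merged matrix at the target rank.
    merged = sum(r * d for r, d in zip(ratios, deltas))
    u, s, vh = np.linalg.svd(merged, full_matrices=False)
    up = u[:, :rank] * s[:rank]   # new lora_B (out_dim x rank)
    down = vh[:rank, :]           # new lora_A (rank x in_dim)
    return up, down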

Thanks for creating the issue.

It is already implemented in PEFT's lora.py:
https://github.com/huggingface/peft/blob/e06d94ddeb6c70913593740618df76908b918d66/src/peft/tuners/lora.py#L476

Here's an example with Stable Diffusion DreamBooth:
https://github.com/huggingface/peft/blob/main/examples/lora_dreambooth/lora_dreambooth_inference.ipynb

from peft import PeftModel

def create_weighted_lora_adapter(pipe, adapters, weights, adapter_name="default"):
    # Combine the listed adapters on the UNet into a single weighted adapter.
    pipe.unet.add_weighted_adapter(adapters, weights, adapter_name)
    # The text encoder only carries adapters if it was trained with LoRA too.
    if isinstance(pipe.text_encoder, PeftModel):
        pipe.text_encoder.add_weighted_adapter(adapters, weights, adapter_name)

    return pipe
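Usage might look like this (adapter names and weights are illustrative, and it assumes both adapters were already loaded onto the pipeline):

pipe = create_weighted_lora_adapter(pipe, ["dog", "crayon"], [0.7, 0.3], adapter_name="dog_crayon")
pipe.unet.set_adapter("dog_crayon")
if isinstance(pipe.text_encoder, PeftModel):
    pipe.text_encoder.set_adapter("dog_crayon")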

Yes, I think both quantization and merging multiple LoRA modules are already implemented in the PEFT repo. Maybe we should go over the PEFT GitHub and map out everything it already provides.
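For our case, the multi-adapter flow could look roughly like this (the base model and adapter paths are made up; 8-bit loading needs bitsandbytes installed):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Hypothetical adapter paths; the base model is loaded quantized to 8-bit.
base = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b", load_in_8bit=True, device_map="auto"
)
model = PeftModel.from_pretrained(base, "path/to/adapter-a", adapter_name="a")
model.load_adapter("path/to/adapter-b", adapter_name="b")
# Equal-weight merge of the two adapters under a new name, then activate it.
model.add_weighted_adapter(["a", "b"], [0.5, 0.5], adapter_name="merged")
model.set_adapter("merged")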

sumo43 commented

This is a good idea. The less functionality we need to implement from scratch, the better.