huggingface/peft

Add Support for IA3 Adapters in the add_weighted_adapter Method ('IA3Model' object has no attribute 'add_weighted_adapter')

Abdullah-kwl opened this issue · 9 comments

Feature request

I propose adding support for IA3 adapters in the add_weighted_adapter method of the PEFT library. IA3 adapters enhance model adaptability with a minimal increase in parameters, offering significant benefits for efficient and effective fine-tuning across various tasks. This feature would allow users to combine IA3 adapters seamlessly, expanding the functional capabilities of the library.

Motivation

The motivation behind this proposal is to address a current limitation of the PEFT library: IA3 adapters are supported, but merging them via the add_weighted_adapter method is not implemented. This gap leads to errors such as 'IA3Model' object has no attribute 'add_weighted_adapter' when attempting to merge IA3 adapters.

The ability to merge IA3 adapters is crucial for users who rely on PEFT for efficient model adaptation and fine-tuning. Without this functionality, users cannot leverage the benefits of IA3 adapters, such as a reduced memory footprint and customizable model tuning, within the PEFT architecture.

Your contribution

I do not have a complete implementation in mind, but I suggest starting with an evaluation of the current add_weighted_adapter function to determine the necessary modifications for supporting IA3 adapters. Collaboration with researchers familiar with IA3 and PEFT could provide insights into feasible approaches.

We had a PR once in #980 but there were a few non-trivial decisions to be made. If you want to work on a PR, please check out the discussion there. Pinging @alexrs also to see if there are any updates.

I am using your branch alexrs:multi-ia3 to test add_weighted_adapter for IA3:

from peft import PeftModel  # import added for completeness

# quantized_model_4bit is the 4-bit base model, loaded beforehand
FIRST_ADAPTER_PATH = "/content/drive/MyDrive/FedMl_test_llm/TrainedModels/WestLake_SFT_IA3_1/Weights/Epoch_1/Job_1"
SECOND_ADAPTER_PATH = "/content/drive/MyDrive/FedMl_test_llm/TrainedModels/WestLake_SFT_IA3_2/Weights/Epoch_1/Job_1"

FIRST_ADAPTER_NAME = "first"
SECOND_ADAPTER_NAME = "second"

# Load the first adapter onto the base model, then load the second alongside it
model = PeftModel.from_pretrained(quantized_model_4bit, FIRST_ADAPTER_PATH, adapter_name=FIRST_ADAPTER_NAME)
_ = model.load_adapter(SECOND_ADAPTER_PATH, adapter_name=SECOND_ADAPTER_NAME)

adapters_li = [FIRST_ADAPTER_NAME, SECOND_ADAPTER_NAME]
weights_li = [0.5, 0.5]

new_adapter = "ADAPTER_WEIGHTED"

# Remove any stale adapter with the same name before re-creating it
if new_adapter in model.peft_config:
    model.delete_adapter(new_adapter)

model.add_weighted_adapter(adapters=adapters_li, weights=weights_li, adapter_name=new_adapter)

After this, it shows me the error:
Invalid type <class 'list'> found in target_modules

I manually tested this with the following code:

loaded_adapters = list(model.peft_config.keys())
print(loaded_adapters)

It shows that I have two adapters loaded in the model:
['first', 'second']

I also checked the target_modules types manually:

module_type = "target_modules"
adapters = adapters_li
module_types = [type(getattr(model.peft_config[adapter], module_type)) for adapter in adapters]

print(module_types) shows [list, list]
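A guess at the cause: the "Invalid type <class 'list'> found in target_modules" message suggests the merging code only handles target_modules stored as a str or a set, not a list. A minimal, untested workaround sketch (reusing the names defined above) would be to normalize each adapter's target_modules to a set before merging:

# Untested sketch: convert list-typed target_modules to sets, assuming
# the type check in add_weighted_adapter accepts str or set but not list.
for adapter in adapters_li:
    tm = model.peft_config[adapter].target_modules
    if isinstance(tm, list):
        model.peft_config[adapter].target_modules = set(tm)

model.add_weighted_adapter(adapters=adapters_li, weights=weights_li, adapter_name=new_adapter)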


Please @alexrs and @BenjaminBossan, take a look at this. I urgently need to merge a few IA3 adapters; how can I do that?

Thanks for the ping @BenjaminBossan

Hi @Abdullah-kwl, the code I wrote for #980 is very outdated. The HF folks have been shipping lots of code lately! 👏

I guess this is a good moment to start the conversation again (sorry for the very long delay!). We want to implement add_weighted_adapter for $(IA)^3$ adapters in a way that is comparable to the LoRA implementation. The implementation, however, is a bit different for a couple of reasons:

  • $(IA)^3$ introduces trainable vectors instead of matrices, therefore it does not make sense to support most of the combination types implemented for LoRA.
  • $(IA)^3$ combines with the base weights multiplicatively. When we discussed this issue back in November last year, we did not reach an agreement on whether we should do a linear combination of adapters or multiply them.

I'd suggest that we can start with a simple implementation of add_weighted_adapter that combines adapters using a weighted average of $(IA)^3$ vectors (equivalent to linear in LoRA if I remember correctly).
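As a sketch of what that weighted average would look like (my notation, not from the PR): given adapters with learned scaling vectors $l^{(i)}$ and user-supplied weights $w_i$, the combined adapter's vector would be

$$l_{\text{new}} = \sum_i w_i \, l^{(i)}$$

applied to the same target modules as the source adapters.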

Thoughts @BenjaminBossan @pacman100 ?

I can try to prototype something in the next few days if you think this is a good approach!

It is good to start with a simpler linear approach.

I'd suggest that we can start with a simple implementation of add_weighted_adapter

That would be fantastic. Let's start with something simple and not try to have a "feature complete" copy of add_weighted_adapter for LoRA.

I can try to prototype something in the next few days if you think this is a good approach!

Thanks, that would be great. Maybe once you have the first testable version, @Abdullah-kwl can test it out and give feedback on whether it works well or not.

Hi @Abdullah-kwl

I found some time to prototype an implementation. You can find it in #1701

It is still a work in progress; I have not done any manual testing to check that the result is correct. You can give it a try and report any issues!

Assuming you have two $(IA)^3$ adapters, you should be able to use:

peft_model.add_weighted_adapter(ia3_adapters, [0.1, 0.9], "weighted_adapter")

to combine them using a weighted average of the adapters.
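For a slightly fuller picture (a hypothetical sketch: the paths and adapter names are placeholders, and base_model stands for whatever model the adapters were trained on):

from peft import PeftModel

# Load two IA3 adapters onto the same base model (placeholder paths/names)
peft_model = PeftModel.from_pretrained(base_model, "path/to/ia3_adapter_a", adapter_name="a")
peft_model.load_adapter("path/to/ia3_adapter_b", adapter_name="b")

# Combine them with a 10/90 weighted average and activate the result
ia3_adapters = ["a", "b"]
peft_model.add_weighted_adapter(ia3_adapters, [0.1, 0.9], "weighted_adapter")
peft_model.set_adapter("weighted_adapter")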

I have tested this, and it is working.


It is working now. The next step could be SVD-like strategies.

I am also facing another IA3-related issue: "Cannot merge ia3 layers when the model is loaded in 4-bit", mentioned in #1704.

@BenjaminBossan please also take a look at whether we can add this feature (merge_and_unload for IA3 adapters) for both 4-bit and 8-bit quantized models.
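Until merging into quantized layers is supported, a common workaround (a sketch with a placeholder model id and paths, not specific to this thread) is to reload the base model in full or half precision, attach the IA3 adapter, and merge there:

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Reload the base model without 4-bit quantization (placeholder model id)
base = AutoModelForCausalLM.from_pretrained("base-model-id", torch_dtype=torch.float16)

# Attach the trained IA3 adapter and fold its vectors into the base weights
merged = PeftModel.from_pretrained(base, "path/to/ia3_adapter").merge_and_unload()
merged.save_pretrained("path/to/merged_model")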

I have tested this, and it is working.

Thanks for giving this a spin. If you have any numbers to share, like scores before and after merging, or even code, that would be great.

I am also facing another IA3-related issue: "Cannot merge ia3 layers when the model is loaded in 4-bit"

Indeed, this is not yet supported. We will certainly take a look at this at some point, but contributions are also very welcome. (And please don't post the same issue twice)