huggingface/peft

Support merge_and_unload for IA3 adapters with 4-bit and 8-bit quantized models

Abdullah-kwl opened this issue · 1 comments

Feature request

Enable merge_and_unload functionality for IA3 adapters on models loaded with 4-bit or 8-bit quantization. Currently, merging fails with the error "Cannot merge ia3 layers when the model is loaded in 4-bit mode".
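
For reference, a minimal reproduction sketch is below. The base model choice is an assumption for illustration; any bitsandbytes 4-bit model with an IA3 adapter should trigger the same error:

```python
# Minimal reproduction sketch (assumes bitsandbytes is installed and a GPU is
# available; the base model name is a placeholder, not taken from the issue).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import IA3Config, get_peft_model

# Load a small causal LM in 4-bit with bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=bnb_config,
)

# Attach an IA3 adapter; peft provides default target modules for OPT.
peft_model = get_peft_model(base_model, IA3Config(task_type="CAUSAL_LM"))

# This currently raises:
# ValueError: Cannot merge ia3 layers when the model is loaded in 4-bit mode
merged_model = peft_model.merge_and_unload()
```

For comparison, LoRA layers in peft already support merging into bitsandbytes-quantized base models (by dequantizing, merging, and requantizing the affected weights), so an analogous path for IA3 seems feasible.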

Motivation

Existing merge_and_unload support excludes 4-bit quantized models with IA3 adapters. Merging IA3 adapters into a 4-bit quantized base model preserves the size reduction of quantization and simplifies deployment by producing a single, smaller model. This feature aligns with the core advantages of IA3 (a very small number of added parameters) and 4-bit quantization (memory efficiency), enabling users to fully exploit both optimizations.

Your contribution

While I cannot currently submit a pull request, I'm happy to provide further details, test the feature once it is implemented, and assist with documentation updates if needed.

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.