huggingface/notebooks

How to load a fine-tuned IDEFICS model for inference?

imrankh46 opened this issue · 2 comments

Hi, I recently fine-tuned the IDEFICS model with PEFT, but I am not able to load the model back.
Is there any way to load the PEFT fine-tuned model for inference?
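
Roughly, what I am trying to do is save the adapter with model.save_pretrained after training and then reattach it to the base model for inference. A minimal sketch of the reload step (the adapter path is a placeholder):

from transformers import IdeficsForVisionText2Text
from peft import PeftModel

# reload the frozen base model
base = IdeficsForVisionText2Text.from_pretrained(
    "HuggingFaceM4/idefics-9b", device_map="auto"
)

# attach the saved PEFT adapter on top of the base model
model = PeftModel.from_pretrained(base, "./idefics-9b-adapter")  # placeholder path
model.eval()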

dvelho commented

Hi, I'm also facing the same issue. The fine-tuned model works well until I call merge_and_unload on it.

I trained it following the Colab notebook.

Here is the code I use to load the fine-tuned model:


device = "cuda" if torch.cuda.is_available() else "cpu"

checkpoint = "HuggingFaceM4/idefics-9b"

bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_use_double_quant=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.float16,
llm_int8_skip_modules=["lm_head", "embed_tokens"],
)

processor = AutoProcessor.from_pretrained(checkpoint, use_auth_token=False)
model = IdeficsForVisionText2Text.from_pretrained(checkpoint, quantization_config=bnb_config, device_map="auto")
model = PeftModel.from_pretrained(model, "mrm8488/idefics-9b-ft-describe-diffusion-bf16")
model = model.merge_and_unload()
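
For reference, a PeftModel can be used for generation directly, without calling merge_and_unload first. A minimal sketch of an inference call on the model loaded above (the image URL and prompt are placeholders, not from the original notebook):

# build a prompt interleaving an image URL with text, as the IDEFICS processor expects
prompts = [
    [
        "https://example.com/image.jpg",  # placeholder image URL
        "Question: What is in the picture? Answer:",
    ],
]

inputs = processor(prompts, return_tensors="pt").to(device)
generated_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])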

Same issue here.