magic-research/PLLaVA

finetune problem

AshOneN opened this issue · 3 comments

I'd like to fine-tune pllava-7B on downstream tasks, so I modified config_pllava_nframe.py and my train_pllava.sh (screenshots of both attached). Training itself runs normally.

I believe pllava_video_outputs/test_train_7b_reconstruct/pretrained_epoch09 contains the LoRA and projection-layer weights from my last training epoch, so I ran

bash scripts/demo.sh pllava-7B /public/nijiahui/pllava_video_outputs/test_train_7b_reconstruct/pretrained_epoch09

to test it. There are no errors, but the model produces no output (screenshots attached).

Is there any problem with my procedure?

Hi, I ran into the same problem and have no clue so far. Have you solved it?

I've discovered two methods to solve the problem.

One approach involves running the script "bash scripts/demo.sh ${model_dir} ${weights_dir}", setting model_dir to llava-hf/llava-v1.6-vicuna-7b-hf and weights_dir to pretrained_epochXX.

Alternatively, you can create a new folder and copy the files from ckpt_epochXX, pretrained_epochXX, and pretrained_stepXX into it. Note that ckpt_epochXX and pretrained_epochXX each contain a model.safetensors file with the same name; when they collide, keep the one from ckpt_epochXX.
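
As a rough illustration of this second method, here is a small Python sketch. The folder layout and the merged_epoch09 name are assumptions based on the checkpoint names mentioned in this thread, not paths the training script is guaranteed to produce; adjust them to your own output directory.

import os
import shutil

# Checkpoint folders as discussed above; "XX" in pretrained_stepXX stands for
# whatever step suffix your run produced, and merged_epoch09 is a hypothetical
# name for the new folder.
out_root = "pllava_video_outputs/test_train_7b_reconstruct"
source_dirs = [
    f"{out_root}/pretrained_stepXX",   # replace XX with your step suffix
    f"{out_root}/pretrained_epoch09",  # lora + projector weights
    f"{out_root}/ckpt_epoch09",        # its model.safetensors should be kept
]
merged_dir = f"{out_root}/merged_epoch09"
os.makedirs(merged_dir, exist_ok=True)

# Copy in order; a later directory overwrites files of the same name, so the
# model.safetensors that survives is the one from ckpt_epoch09.
for src in source_dirs:
    for fn in os.listdir(src):
        path = os.path.join(src, fn)
        if os.path.isfile(path):
            shutil.copy2(path, os.path.join(merged_dir, fn))

The merged folder can then presumably be passed to scripts/demo.sh as the weight_dir.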

Testing both methods with fine-tuned models yielded identical outputs for the same inputs.

model_dir here holds the base model's weights, which are loaded by PllavaForCausualLM.from_pretrained; it should contain the weights of the original image language model.

weight_dir then points to the weights saved during video training (for the 7B LoRA model, only the projector and LoRA weights are trained). These weights are loaded again after the PeftModel is obtained with get_peft_model.

At least one of model_dir or weight_dir must contain the image model's weights, so setting model_dir to llava-hf/llava-v1.6-vicuna-7b-hf is the easiest way to satisfy this.
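
To make this two-step loading order concrete, here is a minimal sketch. It is not the repository's actual code: LlavaNextForConditionalGeneration and the peft calls stand in for the repo's own model class and LoRA setup, and the LoRA hyperparameters and weight_dir path are placeholders.

from safetensors import safe_open
from transformers import LlavaNextForConditionalGeneration
from peft import LoraConfig, get_peft_model

# 1) model_dir: the base image language model, loaded with from_pretrained.
model = LlavaNextForConditionalGeneration.from_pretrained(
    "llava-hf/llava-v1.6-vicuna-7b-hf"
)

# 2) Wrap it as a PeftModel; these LoRA hyperparameters are placeholders,
#    not the project's training config.
lora_config = LoraConfig(r=128, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)

# 3) weight_dir: the projector + LoRA weights saved during video training are
#    loaded on top of the PeftModel; strict=False because only a subset of
#    parameters is present in this file.
weight_dir = "/path/to/pretrained_epochXX"  # placeholder path
state_dict = {}
with safe_open(f"{weight_dir}/model.safetensors", framework="pt", device="cpu") as f:
    for k in f.keys():
        state_dict[k] = f.get_tensor(k)
print(model.load_state_dict(state_dict, strict=False))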


By the way, the loading code (pasted below) will load either a full model (sharded in the Hugging Face format) or the LoRA + projector weights (a single model.safetensors), and it gives the full model higher priority. So I think this solution ends up loading only the PLLaVA weights.

import os
from safetensors import safe_open

if weight_dir is not None:
    state_dict = {}
    save_fnames = os.listdir(weight_dir)

    # Decide whether weight_dir holds a full sharded model (model-0xxxx-of-*.safetensors)
    # or only the lora + projector weights in a single model.safetensors.
    if "model.safetensors" in save_fnames:
        use_full = False
        for fn in save_fnames:
            if fn.startswith('model-0'):
                use_full = True
                break
    else:
        use_full = True

    if not use_full:
        # Lora + projector weights only: load the single model.safetensors.
        print("Loading weight from", weight_dir, "model.safetensors")
        with safe_open(f"{weight_dir}/model.safetensors", framework="pt", device="cpu") as f:
            for k in f.keys():
                state_dict[k] = f.get_tensor(k)
    else:
        # Full model: load every shard.
        print("Loading weight from", weight_dir)
        for fn in save_fnames:
            if fn.startswith('model-0'):
                with safe_open(f"{weight_dir}/{fn}", framework="pt", device="cpu") as f:
                    for k in f.keys():
                        state_dict[k] = f.get_tensor(k)

    if 'model' in state_dict.keys():
        msg = model.load_state_dict(state_dict['model'], strict=False)
    else:
        msg = model.load_state_dict(state_dict, strict=False)
    print(msg)