YiyangZhou/POVID

Loading adapter weights from /root/autodl-tmp/POVID/checkpoint/output/POVID_stage_one_LoRa_bs2 led to unexpected keys not found in the model:


Loading adapter weights from /root/autodl-tmp/POVID/checkpoint/output/POVID_stage_one_LoRa_bs2 led to unexpected keys not found in the model:

['model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.lora_A.default.weight',
 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.lora_B.default.weight',
 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.lora_A.default.weight',
 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.lora_B.default.weight',
 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.lora_A.default.weight',
 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.lora_B.default.weight',
 ...
 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.lora_B.default.weight'].

(Full list elided for readability: it covers vision-tower encoder layers 0-23, each contributing q_proj, k_proj, and v_proj lora_A/lora_B weights, 144 keys in total.)
to bfloat16...
Adding LoRA adapters...
Traceback (most recent call last):
  File "/root/autodl-tmp/POVID/llava/train/train_dpo_inherent.py", line 1052, in <module>
    train()
  File "/root/autodl-tmp/POVID/llava/train/train_dpo_inherent.py", line 902, in train
    model = get_peft_model(model, lora_config)
  File "/root/miniconda3/envs/POVID/lib/python3.10/site-packages/peft/mapping.py", line 106, in get_peft_model
    return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](model, peft_config, adapter_name=adapter_name)
  File "/root/miniconda3/envs/POVID/lib/python3.10/site-packages/peft/peft_model.py", line 889, in __init__
    super().__init__(model, peft_config, adapter_name)
  File "/root/miniconda3/envs/POVID/lib/python3.10/site-packages/peft/peft_model.py", line 111, in __init__
    self.base_model = PEFT_TYPE_TO_MODEL_MAPPING[peft_config.peft_type](
  File "/root/miniconda3/envs/POVID/lib/python3.10/site-packages/peft/tuners/lora.py", line 274, in __init__
    super().__init__(model, config, adapter_name)
  File "/root/miniconda3/envs/POVID/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 88, in __init__
    self.inject_adapter(self.model, adapter_name)
  File "/root/miniconda3/envs/POVID/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 219, in inject_adapter
    self._create_and_replace(peft_config, adapter_name, target, target_name, parent, **optionnal_kwargs)
  File "/root/miniconda3/envs/POVID/lib/python3.10/site-packages/peft/tuners/lora.py", line 372, in _create_and_replace
    new_module = self._create_new_module(lora_config, adapter_name, target, **kwargs)
  File "/root/miniconda3/envs/POVID/lib/python3.10/site-packages/peft/tuners/lora.py", line 481, in _create_new_module
    raise ValueError(
ValueError: Target module Dropout(p=0.05, inplace=False) is not supported. Currently, only torch.nn.Linear and Conv1D are supported.

This error occurred during the second stage (DPO training), after I had finished the first stage of training and merged the LoRA weights.
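For what it's worth, this ValueError means the stage-two LoraConfig's target_modules matched a module that is not nn.Linear. One plausible cause (an assumption, not confirmed from the repo's code) is that the stage-one LoRA layers were never fully merged out, so the model handed to get_peft_model() still contains lora_dropout (nn.Dropout) submodules that the target pattern hits. Below is a minimal sketch of two workarounds, assuming the stage-one adapter was trained with PEFT; merge_stage_one, linear_target_modules, the skip list, and the r/alpha values are all placeholders, not the repo's actual settings:

```python
# Hedged sketch of a possible workaround, not a confirmed fix.
# Assumption: the Dropout hit by _create_new_module is a leftover
# lora_dropout module from a stage-one adapter that was attached but
# never fully merged and unloaded.
import torch.nn as nn
from peft import LoraConfig, PeftModel, get_peft_model

def merge_stage_one(base_model, adapter_dir):
    # Fold the stage-one LoRA weights into the base Linear layers and
    # strip all LoRA bookkeeping modules (including lora_dropout).
    peft_model = PeftModel.from_pretrained(base_model, adapter_dir)
    return peft_model.merge_and_unload()

def linear_target_modules(model):
    # Build target_modules from leaf nn.Linear layers only, skipping
    # anything still belonging to a previous adapter or to the
    # multimodal components, so PEFT never tries to wrap a Dropout.
    skip = ("lora_", "vision_tower", "mm_projector", "lm_head")
    return sorted({
        name.split(".")[-1]
        for name, module in model.named_modules()
        if isinstance(module, nn.Linear) and not any(s in name for s in skip)
    })

# Usage (names are placeholders for the training script's own variables):
# model = merge_stage_one(model, "/root/autodl-tmp/POVID/checkpoint/output/POVID_stage_one_LoRa_bs2")
# lora_config = LoraConfig(r=128, lora_alpha=256, lora_dropout=0.05,
#                          target_modules=linear_target_modules(model),
#                          task_type="CAUSAL_LM")
# model = get_peft_model(model, lora_config)
```

Either step alone may be enough: merging removes the stray Dropout modules, while restricting target_modules keeps PEFT from ever matching them.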