ValueError: Target modules ['q_proj', 'v_proj'] not found in the base model. Please check the target modules and try again.
su-heyang opened this issue · 2 comments
su-heyang commented
Training command for train_sft.py:
CUDA_VISIBLE_DEVICES=0 python src/train_sft.py \
    --model_name_or_path /data1/projects/baichuan-7B/ \
    --do_train \
    --dataset alpaca_gpt4_zh \
    --finetuning_type lora \
    --output_dir output \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --fp16
Training fails with: ValueError: Target modules ['q_proj', 'v_proj'] not found in the base model. Please check the target modules and try again.
Does anyone know how to fix this? Thanks!
intothephone commented
Add this argument: --lora_target W_pack
1ring2rta commented
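Baichuan's attention most likely concatenates (Wq, Wk, Wv) into a single W_pack, so the model has no separate q_proj or v_proj modules for LoRA to attach to.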
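To illustrate why the default targets fail, here is a minimal sketch of a name-based target check. It is a simplified stand-in, not PEFT's actual matching code. The module name lists are illustrative assumptions; on a real model they would come from `[name for name, _ in model.named_modules()]`. The helper names `find_targets` and `check_targets` are hypothetical.

```python
# Illustrative module names only (assumed, not dumped from a real checkpoint).
LLAMA_STYLE = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
]
BAICHUAN_STYLE = [
    "model.layers.0.self_attn.W_pack",  # fused Wq, Wk, Wv
    "model.layers.0.self_attn.o_proj",
]


def find_targets(module_names, target_modules):
    """Return the module names whose last path component is a requested target."""
    wanted = set(target_modules)
    return [name for name in module_names if name.rsplit(".", 1)[-1] in wanted]


def check_targets(module_names, target_modules):
    """Hypothetical helper: raise if no module matches any requested target."""
    matched = find_targets(module_names, target_modules)
    if not matched:
        raise ValueError(
            f"Target modules {list(target_modules)} not found in the base model."
        )
    return matched


if __name__ == "__main__":
    # Separate projections: the default LoRA targets are found.
    print(check_targets(LLAMA_STYLE, ["q_proj", "v_proj"]))
    # Fused projection: only W_pack matches, so it must be named explicitly.
    print(check_targets(BAICHUAN_STYLE, ["W_pack"]))
    try:
        check_targets(BAICHUAN_STYLE, ["q_proj", "v_proj"])
    except ValueError as err:
        print(err)
```

Listing the real module names of the loaded model in the same way is a quick method for choosing a valid `--lora_target` value for any architecture.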