qwen_pretrain_path is empty when fine-tuning CosyVoice 2.0 by following run.sh
Opened this issue · 3 comments
hixiaoxiong commented
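I have two questions:
- Will CosyVoice 2.0 get a fine-tuning script soon, similar to the one for the older version in examples/magicdata-read/cosyvoice?
- When I followed the older fine-tuning script, qwen_pretrain_path was empty and training failed with an error. Do I need to download the qwen2.5-0.5B model myself? If so, which variant: the pretrained (base) one or the instruct one? Or can I use llm.pt from the CosyVoice2-0.5B model directly? (I tried pointing qwen_pretrain_path at llm.pt and it still failed.)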
zchoi commented
pretrained_models/CosyVoice2-0.5B/CosyVoice-BlankEN
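A minimal preflight sketch for the path above, assuming the CosyVoice2-0.5B download sits under pretrained_models/ relative to the working directory and that qwen_pretrain_path is expected to be a model directory rather than a checkpoint file such as llm.pt. The helper name is_model_dir is hypothetical and not part of the CosyVoice repository.

```shell
# Assumed layout: adjust pretrained_model_dir to wherever the model was downloaded.
pretrained_model_dir="${pretrained_model_dir:-pretrained_models/CosyVoice2-0.5B}"
qwen_pretrain_path="$pretrained_model_dir/CosyVoice-BlankEN"

# Hypothetical helper: succeeds only if the argument is an existing directory.
is_model_dir() {
  [ -d "$1" ]
}

# Report the result without aborting, so this can be pasted into a larger script.
if is_model_dir "$qwen_pretrain_path"; then
  echo "qwen_pretrain_path looks usable: $qwen_pretrain_path"
else
  echo "not a directory: $qwen_pretrain_path" >&2
fi
```

Running this before training makes it obvious whether the path points at the CosyVoice-BlankEN directory or, by mistake, at a single file.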
hixiaoxiong commented
pretrained_models/CosyVoice2-0.5B/CosyVoice-BlankEN
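Thanks! After adding this path, the inference part of examples/magicdata-read/cosyvoice/run.sh runs fine, but the "train llm" part fails with the following error: "NotImplementedError: Module [Qwen2LM] is missing the required "forward" function". My training script is below; it is copied almost verbatim from examples/magicdata-read/cosyvoice/run.sh. Have I misconfigured something? Any help would be much appreciated.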
export CUDA_VISIBLE_DEVICES="0"
num_gpus=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
job_id=1986
dist_backend="nccl"
num_workers=2
prefetch=100
train_engine=torch_ddp
if [ ${stage} -le 5 ] && [ ${stop_stage} -ge 5 ]; then
  echo "Run train. We only support llm traning for now. If your want to train from scratch, please use conf/cosyvoice.fromscratch.yaml"
  if [ $train_engine == 'deepspeed' ]; then
    echo "Notice deepspeed has its own optimizer config. Modify conf/ds_stage2.json if necessary"
  fi
  cp data/train/parquet/data.list data/train.data.list
  cp data/dev/parquet/data.list data/dev.data.list
  for model in llm flow; do
    torchrun --nnodes=1 --nproc_per_node=$num_gpus \
        --rdzv_id=$job_id --rdzv_backend="c10d" --rdzv_endpoint="localhost:0" \
      cosyvoice/bin/train.py \
      --train_engine $train_engine \
      --config conf/cosyvoice.yaml \
      --train_data data/train.data.list \
      --cv_data data/dev.data.list \
      --model $model \
      --checkpoint $pretrained_model_dir/$model.pt \
      --model_dir `pwd`/exp/cosyvoice/$model/$train_engine \
      --tensorboard_dir `pwd`/tensorboard/cosyvoice/$model/$train_engine \
      --ddp.dist_backend $dist_backend \
      --num_workers ${num_workers} \
      --prefetch ${prefetch} \
      --pin_memory \
      --deepspeed_config ./conf/ds_stage2.json \
      --deepspeed.save_states model+optimizer
  done
fi
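The excerpt above uses ${stage}, ${stop_stage}, and $pretrained_model_dir without showing where they are set. The sketch below is a dry run under assumed values for those three variables (they are illustrative, not taken from the original script); it reuses the same awk-based GPU count and only prints what each loop iteration would train instead of calling torchrun. The helper name count_gpus is hypothetical.

```shell
# Assumed values: the excerpt does not show these definitions.
stage=5
stop_stage=5
pretrained_model_dir=pretrained_models/CosyVoice2-0.5B

# Same derivation as in the script: count the comma-separated device ids.
count_gpus() {
  echo "$1" | awk -F "," '{print NF}'
}

# Default to device 0 when the variable is unset or empty.
export CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:-0}"
num_gpus=$(count_gpus "$CUDA_VISIBLE_DEVICES")

# Quoting the variables avoids a "unary operator expected" error if one is empty.
if [ "${stage}" -le 5 ] && [ "${stop_stage}" -ge 5 ]; then
  for model in llm flow; do
    echo "would train $model on $num_gpus GPU(s) from $pretrained_model_dir/$model.pt"
  done
fi
```

Printing the resolved checkpoint path per model is a cheap way to confirm that $pretrained_model_dir points at the intended download before launching a real run.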
wjddd commented
Did you manage to get fine-tuning working?