declare-lab/tango

about tango-full-ft-audiocaps

Yusiissy opened this issue · 1 comment

Hi, thanks for your great open source work!

In your work, I noticed that you fine-tuned the tango-full checkpoint on the AudioCaps dataset.
Could you share the command you used for that fine-tuning run?
Do I need to change the learning rate (default = 3e-5), and should I pass --hf_model or --resume_from_checkpoint in the command?

Looking forward to your reply, thanks again!😊

Yes, you can use the --hf_model argument to pass the tango-full checkpoint for that. The full command would be:

accelerate launch train.py \
--train_file="data/train_audiocaps.json" --validation_file="data/valid_audiocaps.json" --test_file="data/test_audiocaps_subset.json" \
--hf_model "declare-lab/tango-full" --unet_model_config="configs/diffusion_model_config.json" --freeze_text_encoder \
--gradient_accumulation_steps 4 --per_device_train_batch_size=2 --per_device_eval_batch_size=2 --augment \
--learning_rate=3e-5 --num_train_epochs 40 --snr_gamma 5 \
--text_column captions --audio_column location --checkpointing_steps="best"
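
In case it is useful, below is a rough sketch of how the AudioCaps manifests referenced in that command could be assembled, assuming each record is a JSON object with "captions" and "location" keys (the values passed to --text_column and --audio_column). The csv layout ("audiocap_id", "caption"), directory names, and the write_manifest helper are illustrative assumptions, not the repo's official preprocessing script.

# Sketch: build a JSON-lines manifest with "captions" and "location" fields,
# matching the --text_column and --audio_column arguments used above.
# The csv column names and file paths here are assumptions for illustration.
import csv
import json

def write_manifest(csv_path: str, audio_dir: str, out_path: str) -> None:
    """Convert an AudioCaps-style csv into a JSON-lines manifest."""
    with open(csv_path, newline="") as f_in, open(out_path, "w") as f_out:
        for row in csv.DictReader(f_in):
            record = {
                "captions": row["caption"],                            # text prompt
                "location": f"{audio_dir}/{row['audiocap_id']}.wav",   # path to the audio clip
            }
            f_out.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    # Hypothetical input/output paths; adjust to wherever your AudioCaps csv and audio live.
    write_manifest("audiocaps_train.csv", "data/audiocaps/train", "data/train_audiocaps.json")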