Train a model for a new language
I want to train a model on a new programming language. Without fine-tuning, the model cannot generate it at all, because the language belongs to an internal front-end framework and open-source models have no corresponding corpus. I want to fine-tune Qwen-2.5-Coder-32B so that the component code it writes complies with the specifications in the internal framework documentation. How should I train Qwen-2.5-Coder-32B for this? Do we need to pretrain, or is fine-tuning Qwen-2.5-Coder-32B enough?
https://github.com/QwenLM/Qwen2.5-Coder/tree/main/finetuning
Here are our fine-tuning scripts; you can try them.
Whether to pretrain depends on your requirements and resources. We advise you to try it first. Hoping to hear about your successful implementation on Qwen-Coder :)
Thank you for your reply. If I fine-tune Qwen-Coder, can I do it in two steps? First, fine-tune on the basic grammar of the new language; second, do instruction fine-tuning. Could you give me some training suggestions? Thanks.
You can try lower-quality data in the first stage and high-quality data in the second SFT stage. That may bring more improvement (https://arxiv.org/abs/2412.05210).
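One way to picture the two-stage split suggested above is the sketch below. It assumes each training sample already carries a quality score; the `Sample` class, the `quality` field, the example texts, and the 0.8 threshold are all hypothetical and only illustrate the idea of routing noisier data to stage 1 and curated data to stage 2.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    text: str
    quality: float  # hypothetical score in [0, 1], e.g. from a heuristic or manual review


def split_by_quality(samples, threshold=0.8):
    """Return (stage1, stage2): lower-quality data first, curated data second."""
    stage1 = [s for s in samples if s.quality < threshold]
    stage2 = [s for s in samples if s.quality >= threshold]
    return stage1, stage2


# Made-up samples, for illustration only.
samples = [
    Sample("scraped component snippet", 0.4),
    Sample("example derived from framework docs", 0.6),
    Sample("hand-reviewed instruction/response pair", 0.95),
]

stage1, stage2 = split_by_quality(samples)
print(f"stage 1: {len(stage1)} samples, stage 2: {len(stage2)} samples")
```

The stage 1 set would be trained on first, and the smaller stage 2 set used for the final SFT pass.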
Okay, thank you for the suggestion. I have another question: should we use full-parameter fine-tuning or LoRA fine-tuning? GPU resources are currently limited, so I plan to use LoRA, but I'm not sure how it will perform.
Both ways are OK; I'm not sure either :(
Waiting for your feedback~
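For a rough sense of why LoRA helps when GPU resources are limited, the sketch below compares trainable parameter counts for a single weight matrix. LoRA replaces the update to a `d_out x d_in` matrix with two low-rank factors of shapes `d_out x r` and `r x d_in`. The dimensions and rank here are illustrative assumptions, not the actual Qwen-2.5-Coder-32B configuration, and the comparison says nothing about final model quality.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for LoRA factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


# Illustrative sizes only (assumed, not taken from the model config).
d_out, d_in, rank = 4096, 4096, 16

full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
ratio = lora / full

print(f"full: {full:,}  lora: {lora:,}  ratio: {ratio:.4%}")
```

With these assumed sizes the LoRA factors hold well under 1% of the parameters of the full matrix, which is where the memory savings in optimizer state and gradients come from.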
OK, thank you. If I do pre-training, is the SFT fine-tuning script suitable for it? I see that the repository only provides the fine-tuning script. Can it be used for pre-training?
You need to modify the script yourself, such as turning off the ChatML format, packing the corpus, and so on.
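As a minimal sketch of what "packing the corpus" can mean: raw documents are tokenized without any chat template, joined into one token stream with an end-of-text separator, and cut into fixed-length training blocks. The function name `pack_sequences`, the toy token ids, and the choice to drop the trailing remainder are assumptions for illustration; this is not the repository's script.

```python
def pack_sequences(token_docs, seq_len, eos_id):
    """Concatenate tokenized documents, each followed by eos_id,
    then cut the stream into blocks of exactly seq_len tokens.
    Any trailing tokens that do not fill a block are dropped."""
    stream = []
    for doc in token_docs:
        stream.extend(doc)
        stream.append(eos_id)  # separator so documents stay distinguishable
    n_blocks = len(stream) // seq_len
    return [stream[i * seq_len:(i + 1) * seq_len] for i in range(n_blocks)]


# Toy token ids standing in for tokenizer output (hypothetical values).
docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
blocks = pack_sequences(docs, seq_len=4, eos_id=0)

for block in blocks:
    print(block)
```

In a real run the token ids would come from the model's tokenizer applied to plain source files, and `seq_len` would be the training context length. Packing avoids wasting compute on padding when documents are much shorter than the context window.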