modelscope/ms-swift

Use PEFT or full-parameter training to fine-tune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)

Python · Apache-2.0

Pinned issues

ModelScope NPU training and deployment discussion group

#1589 opened 4 months ago by Jintao-Huang

Open1

infer, sft, rlhf support for LLama3.2-Vision

#2133 opened 2 months ago by Jintao-Huang

Open3

Fine-tuning best practices for qwen2.5-72b-instruct and qwen2-vl-72b-instruct.

#2064 opened 2 months ago by Jintao-Huang

Open10

Issues

We need to unify the data format for MLLM
#2174 opened a month ago by YerongLi
1
qwen2-vl-72b LoRA fine-tuning: is mixed training on text-only and image instruction data not supported?
#2198 opened a month ago by Luccadoremi
4
Notebook for Continue Pretraining
#2190 opened 2 months ago by msamwelmollel
1
qwen1.5-A2.7B-moe-chat training is too slow
#2213 opened 2 months ago by cdxzyc
5
qwenvl2-72b-instruct-awq finetune error: [rank7]: NotImplementedError: Cannot copy out of meta tensor; no data!
#2208 opened 2 months ago by ljqnb
3
Question about auto_find_batch_size.
#2209 opened 2 months ago by lluo-Desktop
1
Reinforcement learning training for MLLMs
#2212 opened 2 months ago by AnsongLi
5
Can't Find How to Apply Inference on Fine-Tuned Qwen2-vl 7B Model
#2200 opened a month ago by gungorturan
1
Could you provide a Docker image for Megatron training?
#2214 opened a month ago by jhrsya
1
Add the ability to set the request_id parameter for the vllm and lmdeploy deployment backends in swift.llm.inference_client, so that users get traceable logs
#2161 opened 2 months ago by PancakeAwesome
0
Audio output is missing from llama-omni
#2216 opened 2 months ago by satheeshkola-532
0
support for Ovis1.6-Gemma2-9B
#2183 opened 2 months ago by betterftr
1
AttributeError: 'OmniSpeech2SLlamaForCausalLM' object has no attribute '_get_logits_warper'. Did you mean: '_get_logits_processor'?
#2194 opened 2 months ago by satheeshkola-532
6
assert num_new_tags >= 0, f'Number of media: {num_media}, number of media_tags: {num_media_tags}'
#2164 opened 2 months ago by dgedanke
1
Error when deploying the qwen2 72B vl inference service
#2204 opened 2 months ago by zhangfan-algo
6
How to change the .cache/modelspace location
#2179 opened 2 months ago by YerongLi
1
finetuning internvl2 KeyError: 'architectures'
#2180 opened 2 months ago by Li-Jicheng
1
qwen2-vl-7b-instruct vllm inference assert "factor" in rope_scaling
#2188 opened 2 months ago by daje0601
1
Support for Molmo
#2199 opened 2 months ago by betterftr
0
package installation help: swift with transformers, deepspeed, flash-attention, vllm, etc
#2202 opened 2 months ago by Li-Jicheng
1
Finetuned Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4 cannot be loaded with PEFT
#2203 opened 2 months ago by ROIM1998
1
Is online data augmentation for images supported?
#2168 opened 2 months ago by ninghaiywx
0
How do I set the number of sampled video frames when fine-tuning MiniCPM V2_6 on video?
#2196 opened 2 months ago by mycroft1603
0
Training two models on two GPUs in one machine at the same time causes "address already in use".
#2192 opened 2 months ago by ycwfs
0
Are Mac M-series machines supported?
#2163 opened 2 months ago by mobguang
2
How to implement weight decay towards the pre-trained model?
#2189 opened 2 months ago by sedol1339
0
llama-3.2-3b instruct doesn't stop writing
#2184 opened 2 months ago by Aunali321
5
media key for custom dataset is not added to empty_row
#2185 opened 2 months ago by SepehrV
1
Does it support grounding, OCR, Video hybrid fine-tuning training?
#2186 opened 2 months ago by WangRongsheng
2
post_encode for internlm-xcomposer2-7b is incorrect
#2176 opened 2 months ago by YerongLi
1
About DataCollatorForCompletionOnlyLM(label)
#2187 opened 2 months ago by daje0601
2
Help! Error when evaluating a custom dataset on internvl2 with vllm
#2182 opened 2 months ago by NancyFyong
0
Is it possible to support multi-GPU parallel multi-step inference, as well as multi-GPU training where the training data is generated online at each step?
#2181 opened 2 months ago by goodstudent9
0
I would like to use it with Docker, but I don't know how
#2178 opened 2 months ago by daje0601
1
mPlug-Owl3 inference bug report
#2172 opened 2 months ago by goodstudent9
0
qwen2-vl/llama3.2-vl batch_size>1 zero2/zero3 mixed text training error
#2147 opened 2 months ago by Jintao-Huang
0
KTO does not work in the new version
#2167 opened 2 months ago by rtz1998
1
Can we support fp16 and bf16 in training?
#2170 opened 2 months ago by YerongLi
1
Image scaling issues with DeepSeek-vl-1.3b
#2171 opened 2 months ago by xiang-xiang-zhu
1
How to SFT with customized datasets
#2150 opened 2 months ago by YerongLi
1
SFT qwen/Qwen2-VL-7B-Instruct did not pass/process the image embedding for feedforwarding
#2162 opened 2 months ago by YerongLi
5
speed reduction when fine-tuning Qwen2.5-7B
#2166 opened 2 months ago by YingchaoX
0
Using customized dataset evaluation ends with lookup error
#2153 opened 2 months ago by mianbaoji
2
GOT-OCR custom data
#2148 opened 2 months ago by cccusername
8
Official demo not working?
#2159 opened 2 months ago by BenLampson
4
mplug-owl3 doing full model sft bug report
#2158 opened 2 months ago by goodstudent9
6
qwen2-audio-7b-instruct inference outputs nonsense
#2156 opened 2 months ago by Liufeiran123
1
ValueError: Mixed using with peft is not allowed now
#2152 opened 2 months ago by thesby
1
TypeError: ChatGLM4Tokenizer._pad() got an unexpected keyword argument 'padding_side'
#2151 opened 2 months ago by xdaiycl
2
qwen2-vl pretrain support
#2145 opened 2 months ago by Jintao-Huang
0