modelscope/ms-swift
Use PEFT or full-parameter training to fine-tune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
Python · Apache-2.0
Pinned issues
Issues
We need to unify the data format for MLLM
#2174 opened by YerongLi - 4
qwen2-vl-72b LoRA fine-tuning: is mixed training on text-only and image instruction data not supported?
#2198 opened by Luccadoremi - 1
Notebook for Continue Pretraining
#2190 opened by msamwelmollel - 5
qwen1.5-A2.7B-moe-chat training is too slow
#2213 opened by cdxzyc - 3
qwenvl2-72b-instruct-awq finetune error: [rank7]: NotImplementedError: Cannot copy out of meta tensor; no data!
#2208 opened by ljqnb - 1
Question about auto_find_batch_size.
#2209 opened by lluo-Desktop - 5
Reinforcement learning training for MLLMs
#2212 opened by AnsongLi - 1
Could you provide a Docker image for Megatron training?
#2214 opened by jhrsya - 0
swift.llm.inference_client: add the ability to set the request_id parameter for the vllm and lmdeploy deployment backends, so users get traceable logs
#2161 opened by PancakeAwesome - 0
Audio output is missing from the llama-omni
#2216 opened by satheeshkola-532 - 1
support for Ovis1.6-Gemma2-9B
#2183 opened by betterftr - 6
AttributeError: 'OmniSpeech2SLlamaForCausalLM' object has no attribute '_get_logits_warper'. Did you mean: '_get_logits_processor'?
#2194 opened by satheeshkola-532 - 1
assert num_new_tags >= 0, f'Number of media: {num_media}, number of media_tags: {num_media_tags}'
#2164 opened by dgedanke - 6
Error when deploying the qwen2 72B vl inference service
#2204 opened by zhangfan-algo - 1
How to change the .cache/modelspace location
#2179 opened by YerongLi - 1
finetuning internvl2 KeyError: 'architectures'
#2180 opened by Li-Jicheng - 1
Support for Molmo
#2199 opened by betterftr - 1
package installation help: swift with transformers, deepspeed, flash-attention, vllm, etc
#2202 opened by Li-Jicheng - 1
Is online data augmentation for images supported?
#2168 opened by ninghaiywx - 0
How do I set the number of sampled video frames when fine-tuning MiniCPM V2_6 on video?
#2196 opened by mycroft1603 - 0
Training two models on two GPUs in one machine at the same time causes "address already in use".
#2192 opened by ycwfs - 2
Are Mac M-series machines supported?
#2163 opened by mobguang - 0
llama-3.2-3b instruct doesn't stop writing
#2184 opened by Aunali321 - 1
media key for custom dataset is not added to empty_row
#2185 opened by SepehrV - 2
post_encode for internlm-xcomposer2-7b is incorrect
#2176 opened by YerongLi - 2
About DataCollatorForCompletionOnlyLM(label)
#2187 opened by daje0601 - 0
Help! Error when evaluating a custom dataset on internvl2 with vllm
#2182 opened by NancyFyong - 0
I want to use it with docker, but I don't know how
#2178 opened by daje0601 - 0
mPlug-Owl3 inference bug report
#2172 opened by goodstudent9 - 0
Can we support fp16 and bf16 in training?
#2170 opened by YerongLi - 1
Image scaling issues with DeepSeek-vl-1.3b
#2171 opened by xiang-xiang-zhu - 1
How to SFT with customized datasets
#2150 opened by YerongLi - 5
SFT qwen/Qwen2-VL-7B-Instruct did not pass/process the image embedding for feedforwarding
#2162 opened by YerongLi - 0
speed reduction when fine-tuning Qwen2.5-7B
#2166 opened by YingchaoX - 2
GOT-OCR custom data
#2148 opened by cccusername - 4
Official demo not working?
#2159 opened by BenLampson - 6
mplug-owl3 doing full model sft bug report
#2158 opened by goodstudent9 - 1
qwen2-audio-7b-instruct produces nonsense during inference
#2156 opened by Liufeiran123 - 1
ValueError: Mixed using with peft is not allowed now
#2152 opened by thesby - 2
TypeError: ChatGLM4Tokenizer._pad() got an unexpected keyword argument 'padding_side'
#2151 opened by xdaiycl - 0
qwen2-vl pretrain support
#2145 opened by Jintao-Huang