Issues
Qwen2.5 14B models are ... sometimes? ... having their token vocabulary truncated down to 'actual'?
#425 opened by ann-brown - 3
Support for new Llama 3.2 - 1B / 3B ?
#424 opened by David-AU-github - 3
KeyError model[0] did not exist in tensor?
#446 opened by FrozzDay - 1
N-model ModelStock merging
#453 opened by vishaal27 - 13
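For context on the N-model Model Stock request (#453), a minimal sketch of a `model_stock` merge config as mergekit accepts it — all model names below are placeholders, not from the issue:

```yaml
# Hypothetical model_stock merge; model names are placeholders.
# model_stock needs a base model plus two or more fine-tunes of it.
models:
  - model: example-org/finetune-a
  - model: example-org/finetune-b
merge_method: model_stock
base_model: example-org/base-model
dtype: bfloat16
```

Run with `mergekit-yaml config.yaml ./output-model`.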
[request] Support for Vision Language Models
#434 opened by NickGao96 - 0
Rewrite readme more novice-friendly
#462 opened by clover1980 - 0
mergekit for vision models
#461 opened by prince0310 - 1
Why are the names of parameters hard-coded? Is it possible to read it from index.json in HF checkpoints?
#460 opened by zhangzx-uiuc - 4
Is there a way to run LoRA extraction using multiple GPUs? 70B LoRA extraction OOMs on a 24GB 3090 Ti
#393 opened by Nero10578 - 1
Critical Merging Bug just started...
#457 opened by David-AU-github - 0
About Model-Breadcrumbs merge implementation
#455 opened by vishaal27 - 0
Base Model generation time increases when passed through the MergeKit
#454 opened by ahmedamrelhefnawy - 2
MoE merging failed
#452 opened by PsoriasiIR - 2
[question] multi gpu available?
#449 opened by eunbin079 - 2
[question] `task_arithmetic` simple question
#438 opened by eunbin079 - 3
[request] Can it support architectures such as Stable Diffusion XL and Flux dev?
#433 opened by win10ogod - 1
11
#439 opened by meiyiyeshi - 0
After the two Qwen1.5-7B-chat models were merged, garbled inference results appeared.
#437 opened by Zhangfanfan0101 - 3
Broken tokenizer in Yi-34B merge
#428 opened by Asherathe - 0
I would like to merge the deepseekForCausalLM model. Are there any related examples available?
#427 opened by xaiocaibi - 0
Merging Lora fine-tuned models with MoE
#426 opened by AmineBechar07 - 0
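Several issues above (#452, #426, #391) concern `mergekit-moe`. A minimal sketch of a MoE composition config, with placeholder model names and prompts — not taken from any of the issues:

```yaml
# Hypothetical mergekit-moe config; model names and prompts are placeholders.
base_model: example-org/base-model
gate_mode: hidden          # route tokens by hidden-state similarity to the prompts
dtype: bfloat16
experts:
  - source_model: example-org/expert-code
    positive_prompts:
      - "Write a Python function"
  - source_model: example-org/expert-chat
    positive_prompts:
      - "Tell me about"
```

Run with `mergekit-moe config.yaml ./output-model`.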
Support for Vision Model such as ViT
#423 opened by redagavin - 3
Error at MoE Qwen 1.5B
#395 opened by ehristoforu - 2
Support for xlm-roberta
#422 opened by umiron - 2
"mergekit-yaml" not created upon installation
#421 opened by BovineOverlord - 2
Example of a config file for task_arithmetic 'negative' operation and a case for 'Task analogies'
#400 opened by eunbin079 - 1
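Issues #392, #400, and #438 all ask for a worked `task_arithmetic` example. A minimal sketch with placeholder model names — a negative weight subtracts that model's task vector from the base, which is how the 'negative' operation and task-analogy setups are usually expressed:

```yaml
# Hypothetical task_arithmetic merge; all model names are placeholders.
# weight < 0 subtracts the model's task vector instead of adding it.
models:
  - model: example-org/model-chat      # adds this fine-tune's behavior
    parameters:
      weight: 1.0
  - model: example-org/model-toxic     # negative weight removes this behavior
    parameters:
      weight: -0.5
merge_method: task_arithmetic
base_model: example-org/base-model
dtype: float16
```

Run with `mergekit-yaml config.yaml ./output-model`.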
How to use multi GPUs
#420 opened by liudan193 - 1
Would you like to support the Qwen2.5 models?
#419 opened by ArcherShirou - 0
Re-Train every block with reduced width
#414 opened by snapo - 1
I am having problem merging GPT-Neo
#409 opened by 2625554780 - 4
The DARE-TIES experiment.
#411 opened by David-AU-github - 0
Broken links on main page - " Arcee App"
#412 opened by David-AU-github - 2
support for GPT-Neo needed!
#408 opened by 2625554780 - 2
Null vocab_file Issue with mistral v03 based models when using union tokenizer source
#394 opened by guillermo-gabrielli-fer - 1
Merging two mistral based models with different architectures. Looking for some guidance.
#401 opened by AshD - 1
How does a beginner merge models? How do I set up the yaml config file?
#404 opened by yhyub - 0
How to resolve: mergekit-yaml qwen_sail.yaml ./fddfgh/ — Warmup loader cache completes (2/2), then Segmentation fault at Executing graph: 0% (0/1457)
#403 opened by yhyub - 2
passthrough merge error: Tensor model.layers.86.self_attn.k_norm.weight required but not present in model mistralai/Mistral-Large-Instruct-2407
#398 opened by AshD - 0
Working Example of the Mergekit-Evo
#399 opened by nthangelane - 1
Example case of task_arithmetic needed
#392 opened by Opdoop - 0
MergeKit GUI not working.
#397 opened by Abdulhanan535 - 0
Support for Phi-3-Small [Feature ?]
#396 opened by hammoudhasan - 0
MoE exits itself after expert prompts 100% 2/2
#391 opened by SameedHusayn