Issues
Qwen2.5 14B models are ... sometimes? ... having their token vocabulary truncated down to 'actual'?
#425 opened by ann-brown - 3
Support for new Llama 3.2 - 1B / 3B ?
#424 opened by David-AU-github - 3
KeyError model[0] did not exist in tensor?
#446 opened by FrozzDay - 1
N-model ModelStock merging
#453 opened by vishaal27 - 13
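For context on the N-model Model Stock request (#453), a minimal sketch of a `model_stock` merge config as mergekit accepts it — all model names below are placeholders, not from the issue:

```yaml
# Hypothetical model_stock merge; model names are placeholders.
# model_stock needs a base model plus two or more fine-tunes of it.
models:
  - model: example-org/finetune-a
  - model: example-org/finetune-b
merge_method: model_stock
base_model: example-org/base-model
dtype: bfloat16
```

Run with `mergekit-yaml config.yaml ./output-model`.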
[request] Support for Vision Language Models
#434 opened by NickGao96 - 0
Rewrite readme more novice-friendly
#462 opened by clover1980 - 0
mergekit for vision models
#461 opened by prince0310 - 1
Why are the names of parameters hard-coded? Is it possible to read it from index.json in HF checkpoints?
#460 opened by zhangzx-uiuc - 4
Is there a way to run LoRA extraction using multiple GPUs? 70B LoRA extraction OOMs on a 24GB 3090 Ti
#393 opened by Nero10578 - 1
Critical Merging Bug just started...
#457 opened by David-AU-github - 0
About Model-Breadcrumbs merge implementation
#455 opened by vishaal27 - 0
Base Model generation time increases when passed through the MergeKit
#454 opened by ahmedamrelhefnawy - 2
MoE merging failed
#452 opened by PsoriasiIR - 2
[question] multi gpu available?
#449 opened by eunbin079 - 2
[question] `task_arithmetic` simple question
#438 opened by eunbin079 - 3
[request] Can it support architectures such as Stable Diffusion XL and Flux dev?
#433 opened by win10ogod - 1
11
#439 opened by meiyiyeshi - 0
After the two Qwen1.5-7B-chat models were merged, garbled inference results appeared.
#437 opened by Zhangfanfan0101 - 3
Broken tokenizer in Yi-34B merge
#428 opened by Asherathe - 0
I would like to merge the deepseekForCausalLM model. Are there any related examples available?
#427 opened by xaiocaibi - 0
Merging Lora fine-tuned models with MoE
#426 opened by AmineBechar07 - 0
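Several issues above (#452, #426, #391) concern `mergekit-moe`. A minimal sketch of a MoE composition config, with placeholder model names and prompts — not taken from any of the issues:

```yaml
# Hypothetical mergekit-moe config; model names and prompts are placeholders.
base_model: example-org/base-model
gate_mode: hidden          # route tokens by hidden-state similarity to the prompts
dtype: bfloat16
experts:
  - source_model: example-org/expert-code
    positive_prompts:
      - "Write a Python function"
  - source_model: example-org/expert-chat
    positive_prompts:
      - "Tell me about"
```

Run with `mergekit-moe config.yaml ./output-model`.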
Support for Vision Model such as ViT
#423 opened by redagavin - 3
Error at MoE Qwen 1.5B
#395 opened by ehristoforu - 2
Support for xlm-roberta
#422 opened by umiron - 2
"mergekit-yaml" not created upon installation
#421 opened by BovineOverlord - 2
Example of a config file for task_arithmetic 'negative' operation and a case for 'Task analogies'
#400 opened by eunbin079 - 1
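Issues #392, #400, and #438 all ask for a worked `task_arithmetic` example. A minimal sketch with placeholder model names — a negative weight subtracts that model's task vector from the base, which is how the 'negative' operation and task-analogy setups are usually expressed:

```yaml
# Hypothetical task_arithmetic merge; all model names are placeholders.
# weight < 0 subtracts the model's task vector instead of adding it.
models:
  - model: example-org/model-chat      # adds this fine-tune's behavior
    parameters:
      weight: 1.0
  - model: example-org/model-toxic     # negative weight removes this behavior
    parameters:
      weight: -0.5
merge_method: task_arithmetic
base_model: example-org/base-model
dtype: float16
```

Run with `mergekit-yaml config.yaml ./output-model`.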
How to use multi GPUs
#420 opened by liudan193 - 1
Would you like to support the Qwen2.5 models?
#419 opened by ArcherShirou - 0
Re-Train every block with reduced width
#414 opened by snapo - 1
I am having problem merging GPT-Neo
#409 opened by 2625554780 - 4
The DARE-TIES experiment.
#411 opened by David-AU-github - 0
Broken links on main page - " Arcee App"
#412 opened by David-AU-github - 2
support for GPT-Neo needed!
#408 opened by 2625554780 - 2
Null vocab_file Issue with mistral v03 based models when using union tokenizer source
#394 opened by guillermo-gabrielli-fer - 1
Merging two mistral based models with different architectures. Looking for some guidance.
#401 opened by AshD - 1
How does a beginner merge models? How do I set up the yaml config file?
#404 opened by yhyub - 0
How to resolve: mergekit-yaml qwen_sail.yaml ./fddfgh/ — Warmup loader cache completes (2/2), then Segmentation fault at Executing graph: 0% (0/1457)
#403 opened by yhyub - 2
passthrough merge error: Tensor model.layers.86.self_attn.k_norm.weight required but not present in model mistralai/Mistral-Large-Instruct-2407
#398 opened by AshD - 0
Working Example of the Mergekit-Evo
#399 opened by nthangelane - 1
Example case of task_arithmetic needed
#392 opened by Opdoop - 0
MergeKit GUI not working.
#397 opened by Abdulhanan535 - 0
Support for Phi-3-Small [Feature ?]
#396 opened by hammoudhasan - 0
MoE exits itself after expert prompts 100% 2/2
#391 opened by SameedHusayn