MoE merging failed
I encountered an error while trying to merge two Qwen-based LoRA models using a mixture-of-experts (MoE) configuration. I'm working with a phi2_moe2.yml configuration file, but the system throws an error about a missing field (merge_method).
Configuration and Setup
I am using the following configuration YAML:
```yml
base_model: CMLM/ZhongJing-2-1_8b
gate_mode: hidden  # one of "hidden", "cheap_embed", or "random"
# dtype: float16   # output dtype (float32, float16, or bfloat16)
experts:
  - source_model: CMLM/ZhongJing-2-1_8b
    positive_prompts: []
  - source_model: Qwen2.5-1.5B-Instruct
    positive_prompts: []
```
When I run this setup, I get the following error:
```
[2024-11-04 18:51:10] [ERROR] Invalid yaml 1 validation error for MergeConfiguration
merge_method
  Field required [type=missing, input_value={'base_model': 'CMLM/ZhongJing-2-1_8b', 'gate_mode': 'hidden', 'experts': [{'source_model': 'CMLM/ZhongJing-2-1_8b', 'positive_prompts': []}, {'source_model': 'Qwen2.5-1.5B-Instruct', 'positive_prompts': []}]}]
```
Attempted Solutions
I suspect adding merge_method might resolve the issue, but I'm not sure what options are available for this field. I would appreciate guidance on:
- A complete YAML file for a Qwen MoE merge, including the correct merge_method (see the sketch below)
- Documentation or examples: are there any detailed examples or docs that explain each field in the YAML configuration for MoE?
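For what it's worth, `merge_method` belongs to the standard merge schema that `mergekit-yaml` validates, not to the MoE schema; documented values include `linear`, `slerp`, `ties`, and `dare_ties`. Below is a minimal sketch of a plain (non-MoE) merge config, with placeholder model names and assuming both checkpoints share the same architecture:

```yml
# Minimal sketch of a mergekit-yaml config; "YourOrg/model-a" and
# "YourOrg/model-b" are placeholders, not models from this thread.
# merge_method is required here; linear averages the listed models by weight.
models:
  - model: YourOrg/model-a
    parameters:
      weight: 0.5
  - model: YourOrg/model-b
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16
```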
Additional Context
First model: CMLM/ZhongJing-2-1_8b
Second model: Qwen2.5-1.5B-Instruct
Thank you for your assistance!
It looks like you're using the `mergekit-yaml` command. For this type of config you want to use `mergekit-moe`.
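For concreteness, a hedged sketch of the invocation, with the config saved locally (both paths below are placeholders):

```yml
# Save the MoE config above as, e.g., ./qwen_moe.yml, then run it with
# the MoE entry point rather than mergekit-yaml:
#
#   mergekit-moe ./qwen_moe.yml ./merged-moe-output
#
# mergekit-yaml validates against the standard merge schema (which
# requires merge_method), which is why it rejected this MoE config.
```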
In addition, this particular merge probably won't work - the two models you are looking at aren't the same size, so they will not be compatible.
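To make that concrete: the MoE merge copies expert weights alongside the base model's, so every expert presumably needs the same hidden size and layer count as the base. Since the follow-up below notes that ZhongJing-2 is a Qwen1.5-1.8B fine-tune, a same-size expert such as Qwen/Qwen1.5-1.8B-Chat (an assumption, not taken from the original config) would avoid the shape mismatch:

```yml
# Hedged sketch: both experts are Qwen1.5-1.8B-class, so their tensor
# shapes should line up with the base model's.
base_model: CMLM/ZhongJing-2-1_8b
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: CMLM/ZhongJing-2-1_8b
    positive_prompts: ["请从中医角度分析以下症状。"]  # TCM-style routing prompt
  - source_model: Qwen/Qwen1.5-1.8B-Chat  # assumed same-size expert
    positive_prompts: ["What are the evidence-based treatment options?"]
```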
Thank you for the pointer. Specifically, I have the following models:
Base Model: CMLM/ZhongJing-2-1_8b
Fine-Tuned Model: CMLM/ZhongJing-2-1_8b_finetuned based on Qwen-1.5-1.8B-Chat
Current Challenge: When attempting to merge these models using a YAML configuration, I continue to encounter the error.
Could you provide an example of a correctly structured YAML file for merging these models? Despite following the available guidelines, my attempts to merge via your space result in errors.
Attempted Configuration: Here's the YAML configuration I used:
```yml
base_model: CMLM/ZhongJing-2-1_8b
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: CMLM/ZhongJing-2-1_8b
    positive_prompts:
      # TCM-oriented prompts (in Chinese): symptom analysis from a TCM view,
      # syndrome-pattern identification, yin-yang/five-phases theory,
      # food-property questions, and herb-combination principles.
      - "Human: 请从中医角度分析以下症状。\nAssistant: 好的,我会从中医理论出发,通过望闻问切的方法进行分析。"
      - "Human: 这些症状在中医理论中属于什么证型?\nAssistant: 让我根据中医辨证论治的原则来分析。"
      - "请解释一下中医的阴阳五行理论如何解释这个症状。"
      - "从中医角度来看,这些食材的性质和功效是什么?"
      - "这些中药的配伍原则是什么?"
    negative_prompts:
      - "What's the molecular mechanism of this drug?"
      - "Please explain the pathophysiology of this condition."
  - source_model: Qwen-1.5-1.8B-Chat
    positive_prompts:
      - "Based on modern medical research, what's the diagnosis?"
      - "What are the evidence-based treatment options for this condition?"
      - "Please explain the pathophysiological mechanism."
      - "What laboratory tests should be ordered?"
      - "According to clinical guidelines, what's the recommended treatment protocol?"
    negative_prompts:
      # Chinese negative prompts: "analyze from a yin-yang/five-phases view",
      # "explain the TCM syndrome pattern of this symptom".
      - "从阴阳五行的角度分析"
      - "请解释一下这个症状的中医证型"
```
Thank you very much for your time and assistance. I look forward to your guidance to resolve this merging issue effectively.