passthrough merge error: Tensor model.layers.86.self_attn.k_norm.weight required but not present in model mistralai/Mistral-Large-Instruct-2407
AshD opened this issue · 2 comments
AshD commented
Mergekit (8/18/24): I am trying to create a passthrough merge, and it fails with this error:
RuntimeError: Tensor model.layers.86.self_attn.k_norm.weight required but not present in model mistralai/Mistral-Large-Instruct-2407
The mergekit config is:
dtype: bfloat
merge_method: passthrough
slices:
  - sources:
      - layer_range: [0, 30]
        model: mistralai/Mistral-Large-Instruct-2407
  - sources:
      - layer_range: [5, 35]
        model: mistralai/Mistral-Large-Instruct-2407
  - sources:
      - layer_range: [11, 31]
        model: mistralai/Mistral-Large-Instruct-2407
  - sources:
      - layer_range: [15, 35]
        model: mistralai/Mistral-Large-Instruct-2407
  - sources:
      - layer_range: [22, 42]
        model: mistralai/Mistral-Large-Instruct-2407
  - sources:
      - layer_range: [25, 45]
        model: mistralai/Mistral-Large-Instruct-2407
  - sources:
      - layer_range: [33, 53]
        model: mistralai/Mistral-Large-Instruct-2407
  - sources:
      - layer_range: [40, 80]
        model: mistralai/Mistral-Large-Instruct-2407
  - sources:
      - layer_range: [44, 87]
        model: mistralai/Mistral-Large-Instruct-2407
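As a quick sanity check on a config like this, the layer ranges can be tallied by hand or with a few lines of Python. The sketch below is only illustrative: the ranges are copied from the config, the helper names `count_layers` and `highest_source_layer` are made up for this example, and it assumes each `layer_range` is half-open (start included, end excluded).

```python
# Layer ranges copied from the config above, in slice order.
# Assumption: each range is half-open, i.e. [start, end).
LAYER_RANGES = [
    (0, 30), (5, 35), (11, 31), (15, 35), (22, 42),
    (25, 45), (33, 53), (40, 80), (44, 87),
]


def count_layers(ranges):
    """Return the total number of layers a passthrough stack would contain."""
    return sum(end - start for start, end in ranges)


def highest_source_layer(ranges):
    """Return the largest source layer index referenced by any slice."""
    return max(end - 1 for _start, end in ranges)


total_layers = count_layers(LAYER_RANGES)
last_layer = highest_source_layer(LAYER_RANGES)
print(total_layers, last_layer)
```

Under that assumption the merged stack has 243 layers, and the highest source layer referenced is 86, which matches the layer index named in the error.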
Output
mergekit-yaml ./mergekit_config_mistral.yml ./models --cuda --allow-crimes --lazy-unpickle
Fetching 110 files: 100%|█████████████████████████████████████████████████████████| 110/110 [00:00<00:00, 202801.51it/s]
Warmup loader cache: 100%|████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.54it/s]
Executing graph: 0%| | 1/8990 [00:00<00:15, 578.60it/s]
Traceback (most recent call last):
  File "/home/ash/miniconda3/envs/py310/bin/mergekit-yaml", line 8, in <module>
    sys.exit(main())
  File "/home/ash/miniconda3/envs/py310/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/ash/miniconda3/envs/py310/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/ash/miniconda3/envs/py310/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ash/miniconda3/envs/py310/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/ash/ai/mergekit/mergekit/options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "/home/ash/ai/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/home/ash/ai/mergekit/mergekit/merge.py", line 96, in run_merge
    for _task, value in exec.run(quiet=options.quiet):
  File "/home/ash/ai/mergekit/mergekit/graph.py", line 197, in run
    res = task.execute(**arguments)
  File "/home/ash/ai/mergekit/mergekit/io/tasks.py", line 86, in execute
    raise RuntimeError(
RuntimeError: Tensor model.layers.86.self_attn.k_norm.weight required but not present in model mistralai/Mistral-Large-Instruct-2407
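One way to see whether the checkpoint itself contains the tensor named in the error is to look it up in the checkpoint's tensor index. The sketch below assumes the model ships a `model.safetensors.index.json` with a `weight_map` key, as sharded Hugging Face checkpoints usually do. The helper name `missing_tensors` and the sample index contents (including the shard filenames) are hypothetical and heavily truncated; they are not taken from the real model.

```python
import json


def missing_tensors(index_json_text, required_names):
    """Return the required tensor names that are absent from a safetensors index."""
    weight_map = json.loads(index_json_text)["weight_map"]
    return [name for name in required_names if name not in weight_map]


# Hypothetical, truncated index used only to demonstrate the check.
# The shard filenames are invented for this example.
SAMPLE_INDEX = json.dumps({
    "metadata": {},
    "weight_map": {
        "model.layers.86.self_attn.k_proj.weight": "model-00050-of-00051.safetensors",
        "model.layers.86.self_attn.q_proj.weight": "model-00050-of-00051.safetensors",
    },
})

missing = missing_tensors(SAMPLE_INDEX, [
    "model.layers.86.self_attn.k_proj.weight",
    "model.layers.86.self_attn.k_norm.weight",
])
print(missing)
```

To run it against a real download, read the index file's text from the local model directory and pass it in place of `SAMPLE_INDEX`.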
metric-space commented
Perhaps I am gravely mistaken, but is there any chance the Mistral JSON file that defines the Mistral architecture has been modified on your end?
self_attn.k_norm.weight, in addition to being an odd weight name, doesn't exist here: https://github.com/arcee-ai/mergekit/blob/main/mergekit/_data/architectures/mistral.json
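A rough way to check a local copy of that file for stray weight names is to parse it and search every string value for the suspicious suffix. This is only a sketch: `find_strings` is a made-up helper, and the sample document below is a hypothetical stand-in for a locally modified file, not the real schema or contents of mistral.json.

```python
import json


def find_strings(node, needle):
    """Recursively collect every string value in parsed JSON that contains needle."""
    if isinstance(node, str):
        return [node] if needle in node else []
    if isinstance(node, dict):
        # Only values are searched; dictionary keys are ignored.
        node = list(node.values())
    if isinstance(node, list):
        hits = []
        for child in node:
            hits.extend(find_strings(child, needle))
        return hits
    return []  # numbers, booleans, and null cannot match


# Hypothetical stand-in for a locally modified architecture file.
SAMPLE_TEXT = """
{
  "weights": [
    {"name": "model.layers.0.self_attn.k_proj.weight"},
    {"name": "model.layers.0.self_attn.k_norm.weight"}
  ]
}
"""

hits = find_strings(json.loads(SAMPLE_TEXT), "k_norm")
print(hits)
```

Pointing the same function at the mergekit/_data/architectures/mistral.json file in a local checkout (loaded with `json.load`) and getting a non-empty result would suggest the local file differs from the upstream one linked above.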
AshD commented
Thanks. That was it.