ypeleg/llama

KeyError: 'decoder.layers.35.attention_norm.weight' when running inference

Closed this issue · 1 comment

The code used to work properly, but now the same call raises this KeyError.
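For reference, the failing call reconstructed from the traceback below (a sketch, not my exact script: `MODEL` is a placeholder for a converted LLaMA checkpoint directory, and the `device_map` value is an assumption since the traceback truncates that line):

```python
import llama  # ypeleg/llama

MODEL = "path/to/llama-checkpoint"  # placeholder; any converted LLaMA checkpoint dir

tokenizer = llama.LLaMATokenizer.from_pretrained(MODEL)
# device_map="auto" is an assumption: the traceback cuts the argument off.
model = llama.LLaMAForCausalLM.from_pretrained(MODEL, low_cpu_mem_usage=True, device_map="auto")
model.to('cuda')
```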

```
Traceback (most recent call last)

     8
     9 tokenizer = llama.LLaMATokenizer.from_pretrained(MODEL)
    10 #model = llama.LLaMAForCausalLM.from_pretrained(MODEL, low_cpu_mem_usage = True)
  ❱ 11 model = llama.LLaMAForCausalLM.from_pretrained(MODEL, low_cpu_mem_usage=True, device_map…
    12
    13 model.to('cuda')

/opt/conda/envs/env/lib/python3.9/site-packages/transformers/modeling_utils.py:2326 in from_pretrained

    2323             if dtype_orig is not None:
    2324                 torch.set_default_dtype(dtype_orig)
    2325
  ❱ 2326             model, missing_keys, unexpected_keys, mismatched_keys, error_msgs = cls._loa…
    2327                 model,
    2328                 state_dict,
    2329                 loaded_state_dict_keys,  # XXX: rename?

/opt/conda/envs/env/lib/python3.9/site-packages/transformers/modeling_utils.py:2448 in _load_pretrained_model

    2445             for key in missing_keys:
    2446                 if key.startswith(prefix):
    2447                     key = ".".join(key.split(".")[1:])
  ❱ 2448                 param = model_state_dict[key]
    2449                 if param.device == torch.device("meta"):
    2450                     if not load_in_8bit:
    2451                         set_module_tensor_to_device(model, key, "cpu", torch.empty(*para…

KeyError: 'decoder.layers.35.attention_norm.weight'
```
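The loop at modeling_utils.py:2445-2448 walks the keys reported missing from the checkpoint, strips the model prefix, and looks each one up in the model's own state dict; the KeyError means that after stripping, `decoder.layers.35.attention_norm.weight` no longer matches any parameter name the model knows, i.e. the checkpoint's tensor names and the model's module names have fallen out of sync. One way to see the mismatch directly is to compare the two key sets (a diagnostic sketch, not part of the library; it assumes an unsharded `pytorch_model.bin` checkpoint, and `MODEL` is the same placeholder as above):

```python
import os

import torch
import llama  # ypeleg/llama

MODEL = "path/to/llama-checkpoint"  # placeholder, as above

# Plain CPU load: missing/unexpected keys only produce warnings here,
# so the model still instantiates and we can inspect the keys it expects.
model = llama.LLaMAForCausalLM.from_pretrained(MODEL)
model_keys = set(model.state_dict().keys())

# Assumes a single-file checkpoint; for sharded ones, load each shard
# listed in pytorch_model.bin.index.json the same way.
ckpt = torch.load(os.path.join(MODEL, "pytorch_model.bin"), map_location="cpu")
ckpt_keys = set(ckpt.keys())

print("in checkpoint, not in model:", sorted(ckpt_keys - model_keys)[:10])
print("in model, not in checkpoint:", sorted(model_keys - ckpt_keys)[:10])
```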

Thanks for reporting the issue; it's fixed now!