okuvshynov/slowllama

run prepare_model.py error

Closed this issue · 4 comments

When I run prepare_model.py with CodeLlama-7b, an exception occurs:

RuntimeError: The expanded size of the tensor (32000) must match the existing size (32016) at non-singleton dimension 0. Target sizes: [32000, 4096]. Tensor sizes: [32016, 4096]

This is regular CodeLlama-7b, downloaded from https://llama.meta.com/llama-downloads/, right?

Yes, that's right

From this repo: https://github.com/facebookresearch/codellama

My MacBook is an M1.

This is my prepare_model.py:

import torch
import logging

from loader import prepare_model
from conf_fp16 import *

logging.basicConfig(format='%(asctime)s %(message)s',
                    level=logging.DEBUG, filename='logs/prepare_model.log')
torch.random.manual_seed(seed)

prepare_model(
    llama2_path='/Users/bzhang1/Desktop/codellama/CodeLlama-7b',
    frozen_path='/Users/bzhang1/Desktop/codellama/CodeLlama-7b',
    compute_dtype=compute_dtype,
    lora_rank=lora_rank, frozen_dtype=frozen_dtype)

This is the error message:

Traceback (most recent call last):
  File "/Users/bzhang1/Desktop/codellama/slowllama/prepare_model.py", line 13, in <module>
    prepare_model(llama2_path='/Users/bzhang1/Desktop/codellama/CodeLlama-7b', frozen_path='/Users/bzhang1/Desktop/codellama/CodeLlama-7b', compute_dtype=compute_dtype,
  File "/Users/bzhang1/Desktop/codellama/slowllama/loader.py", line 120, in prepare_model
    apply_subset(block, checkpoint[f'{title}.weight'], ci, title)
  File "/Users/bzhang1/Desktop/codellama/slowllama/loader.py", line 51, in apply_subset
    module.weight[idx_subset] = weight_subset
    ~~~~~~~~~~~~~^^^^^^^^^^^^
RuntimeError: The expanded size of the tensor (32000) must match the existing size (32016) at non-singleton dimension 0.  Target sizes: [32000, 4096].  Tensor sizes: [32016, 4096]
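For reference, the failure is a plain shape-broadcasting mismatch at assignment time: the embedding built from the model config has 32000 rows, while the checkpoint ships 32016. A minimal sketch reproduces it with stand-in shapes (8/10/4 in place of 32000/32016/4096; the assignment mirrors the kind `apply_subset` performs, not its exact code):

```python
import torch

# Stand-in shapes for 32000 (model vocab), 32016 (checkpoint vocab), 4096 (dim).
model_vocab, ckpt_vocab, dim = 8, 10, 4

weight = torch.zeros(model_vocab, dim)  # embedding built from the model config
ckpt = torch.zeros(ckpt_vocab, dim)     # embedding stored in the checkpoint

err = None
try:
    weight[:] = ckpt  # broadcasting 10 rows into 8 fails, as in the traceback
except RuntimeError as e:
    err = str(e)

print(err)
```

The message printed has the same form as the one in the traceback, with 8/10 in place of 32000/32016.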

Hi, sorry for the delay. Did you figure it out? It may be related to #16.

Thank you very much for your reply. I changed the model and the problem was solved:
I had used CodeLlama-7b before, and when I switched to CodeLlama-7b-Python the error was gone.
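That matches the shape difference in the traceback: CodeLlama-7b's checkpoint has a 32016-row token embedding (extra infill/special tokens), while CodeLlama-7b-Python has 32000. As a quick sanity check before running prepare_model.py, one could inspect the embedding shape stored in the checkpoint. A hedged sketch; the `tok_embeddings.weight` key follows the Meta llama checkpoint layout, and the `consolidated.00.pth` filename in the comment is an assumption:

```python
import torch

def embedding_rows(checkpoint_path: str) -> int:
    # Load the consolidated checkpoint on CPU and report how many rows the
    # token-embedding matrix has: 32016 would indicate base CodeLlama-7b,
    # 32000 the CodeLlama-7b-Python variant that worked here.
    ckpt = torch.load(checkpoint_path, map_location="cpu")
    return ckpt["tok_embeddings.weight"].shape[0]

# e.g. embedding_rows('/Users/bzhang1/Desktop/codellama/CodeLlama-7b/consolidated.00.pth')
```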