karpathy/llama2.c

Running export.py with chat-llama-2-7b-chat-hf runs out of memory

Swair opened this issue · 1 comment

Swair commented

$ python export.py llama2_7b.bin --hf chat-llama-2-7b-chat-hf
Traceback (most recent call last):
  File "E:\ai_ws\llama_ws\llama2.c\export.py", line 465, in <module>
    model = load_hf_model(args.hf)
  File "E:\ai_ws\llama_ws\llama2.c\export.py", line 363, in load_hf_model
    hf_model = AutoModelForCausalLM.from_pretrained(model_path)
  File "E:\ai_ws\miniconda3\envs\llama_env\lib\site-packages\transformers\models\auto\auto_factory.py", line 516, in from_pretrained
    return model_class.from_pretrained(
  File "E:\ai_ws\miniconda3\envs\llama_env\lib\site-packages\transformers\modeling_utils.py", line 2876, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "E:\ai_ws\miniconda3\envs\llama_env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 736, in __init__
    self.model = LlamaModel(config)
  File "E:\ai_ws\miniconda3\envs\llama_env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 566, in __init__
    self.layers = nn.ModuleList([LlamaDecoderLayer(config) for _ in range(config.num_hidden_layers)])
  File "E:\ai_ws\miniconda3\envs\llama_env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 566, in <listcomp>
    self.layers = nn.ModuleList([LlamaDecoderLayer(config) for _ in range(config.num_hidden_layers)])
  File "E:\ai_ws\miniconda3\envs\llama_env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 381, in __init__
    self.mlp = LlamaMLP(config)
  File "E:\ai_ws\miniconda3\envs\llama_env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 198, in __init__
    self.up_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False)
  File "E:\ai_ws\miniconda3\envs\llama_env\lib\site-packages\torch\nn\modules\linear.py", line 96, in __init__
    self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 180355072 bytes.
(llama_env)

I used an old script and it worked: #341 (comment)
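
A rough back-of-the-envelope check on the numbers in the traceback, in case it helps anyone hitting the same error. The sketch below assumes the commonly published Llama-2-7B dimensions (hidden_size 4096, intermediate_size 11008, 32 layers, vocab_size 32000); none of these values appear in this issue, and the helper names are made up for illustration. Under those assumptions, the 180355072-byte request matches a single float32 `up_proj` weight, and the whole model in float32 comes to roughly 25 GiB, which would explain the allocator giving up partway through building the layers.

```python
# Assumed Llama-2-7B dimensions (taken from the published model config,
# NOT from this issue). Adjust if your checkpoint differs.
HIDDEN_SIZE = 4096
INTERMEDIATE_SIZE = 11008
NUM_LAYERS = 32
VOCAB_SIZE = 32000
BYTES_PER_FLOAT32 = 4


def linear_bytes(out_features, in_features, bytes_per_param=BYTES_PER_FLOAT32):
    """Memory for one bias-free nn.Linear weight matrix."""
    return out_features * in_features * bytes_per_param


def llama_param_count(hidden, intermediate, layers, vocab):
    """Approximate parameter count for a Llama-style decoder.

    Hypothetical helper: ignores rotary-embedding buffers and assumes
    full-size k/v projections (no grouped-query attention).
    """
    attention = 4 * hidden * hidden      # q, k, v, o projections
    mlp = 3 * hidden * intermediate      # gate, up, down projections
    norms = 2 * hidden                   # two RMSNorm weights per layer
    per_layer = attention + mlp + norms
    embeddings = vocab * hidden          # token embedding table
    lm_head = vocab * hidden             # output projection
    final_norm = hidden
    return layers * per_layer + embeddings + lm_head + final_norm


up_proj_bytes = linear_bytes(INTERMEDIATE_SIZE, HIDDEN_SIZE)
total_params = llama_param_count(HIDDEN_SIZE, INTERMEDIATE_SIZE, NUM_LAYERS, VOCAB_SIZE)
total_fp32_gib = total_params * BYTES_PER_FLOAT32 / 2**30

print(f"one up_proj weight in float32: {up_proj_bytes} bytes")
print(f"total parameters: {total_params}")
print(f"whole model in float32: {total_fp32_gib:.1f} GiB")
```

If this estimate holds, the machine needs well over 25 GiB of free RAM just to instantiate the model in float32 before any export work starts; halving the bytes per parameter (float16) would halve that figure.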