luchangli03/export_llama_to_onnx

OOM when exporting 7B and 13B LLaMA2 on a 3090


Has anyone run into OOM when exporting the 7B and 13B LLaMA2 models on a 3090?

Traceback (most recent call last):
  File "export_llama_to_onnx/export_llama_single.py", line 193, in <module>
    export_llama(args)
  File "export_llama_to_onnx/export_llama_single.py", line 171, in export_llama
    export_llama_to_single_onnx(model, config, dtype, args, "llama_onnx")
  File "export_llama_to_onnx/export_llama_single.py", line 109, in export_llama_to_single_onnx
    torch.onnx.export(
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 506, in export
    _export(
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 1548, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 1113, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 989, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 893, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py", line 1268, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py", line 114, in wrapper
    tuple(x.clone(memory_format=torch.preserve_format) for x in args)
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py", line 114, in <genexpr>
    tuple(x.clone(memory_format=torch.preserve_format) for x in args)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB (GPU 0; 23.69 GiB total capacity; 23.36 GiB already allocated; 34.06 MiB free; 23.36 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
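
The allocator hint at the end of the message is cheap to try, though with only ~34 MiB free on a 23.69 GiB card it is more likely that the model weights plus the input clones made during tracing simply exceed the 3090's VRAM. A minimal sketch of applying the hint, assuming it is set before the first CUDA allocation; the 128 MiB value is an arbitrary starting point, not a recommendation from this repo:

```python
import os

# PYTORCH_CUDA_ALLOC_CONF is read lazily, at the first CUDA allocation,
# so setting it at the top of the export script (before any tensor is
# moved to the GPU) is sufficient.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

import torch  # imported after the env var on purpose
```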

There isn't enough GPU memory. Please run the export on the CPU or use a GPU with more VRAM; you can also `pip install accelerate` and see whether that helps.
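
For reference, a minimal sketch of what a CPU-side export can look like, assuming a stock Hugging Face LLaMA-2 checkpoint; the checkpoint path, dummy input shapes, output path, and opset below are placeholders, not the exact arguments of export_llama_single.py:

```python
import torch
from transformers import LlamaForCausalLM

# low_cpu_mem_usage=True is what `pip install accelerate` enables: weights
# are loaded shard by shard instead of materializing a second full copy.
model = LlamaForCausalLM.from_pretrained(
    "path/to/llama-2-7b",       # hypothetical local checkpoint path
    torch_dtype=torch.float32,  # fp32: many fp16 kernels are CUDA-only
    low_cpu_mem_usage=True,
    use_cache=False,            # no past_key_values, so logits is the sole output
)
model.config.return_dict = False  # tuple outputs are friendlier to tracing
model.eval()

# Dummy inputs only fix the traced shapes; dynamic_axes keeps them flexible.
input_ids = torch.ones(1, 8, dtype=torch.int64)
attention_mask = torch.ones(1, 8, dtype=torch.int64)

with torch.no_grad():
    torch.onnx.export(
        model,
        (input_ids, attention_mask),
        "llama_onnx/llama2_7b.onnx",  # hypothetical output path
        input_names=["input_ids", "attention_mask"],
        output_names=["logits"],
        dynamic_axes={
            "input_ids": {0: "batch", 1: "seq"},
            "attention_mask": {0: "batch", 1: "seq"},
        },
        opset_version=17,
    )
```

Tracing on the CPU sidesteps the clone-induced CUDA OOM in the traceback at the cost of a slower export; for a 7B model in fp32 the host should have roughly 30+ GB of RAM for the weights plus the traced graph.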