Qwen2-7B-Instruct: error when exporting to ONNX
zifeng-radxa opened this issue · 1 comment
zifeng-radxa commented
I have already replaced .venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py,
but exporting to ONNX fails with the following error:
(.venv) root@f9ba65967758:/workspace/test/LLM-TPU/models/Qwen2/compile# python3 export_onnx.py --model_path /workspace/Qwen2-7B-Instruct/ --seq_length 1024 --device cpu
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:28<00:00, 7.02s/it]
Layers: 28
Hidden size: 3584
Convert block & block_cache
0%| | 0/1 [00:00<?, ?it/s]/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py:120: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if seq_len > self.max_seq_len_cached:
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/workspace/test/LLM-TPU/models/Qwen2/compile/export_onnx.py", line 251, in <module>
convert_block(i)
File "/workspace/test/LLM-TPU/models/Qwen2/compile/export_onnx.py", line 162, in convert_block
torch.onnx.export(
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 516, in export
_export(
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1612, in _export
graph, params_dict, torch_out = _model_to_graph(
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1134, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1010, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 914, in _trace_and_get_graph_from_model
trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/jit/_trace.py", line 1310, in _get_trace_graph
outs = ONNXTracedModule(
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/jit/_trace.py", line 138, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/jit/_trace.py", line 129, in wrapper
outs.append(self.inner(*trace_inputs))
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _slow_forward
result = self.forward(*input, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/compile/export_onnx.py", line 73, in forward
hidden_states, past_kv = self.layer(
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _slow_forward
result = self.forward(*input, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 786, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _slow_forward
result = self.forward(*input, **kwargs)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 695, in forward
query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin, position_ids)
File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 171, in apply_rotary_pos_emb
q_embed = (q * cos) + (rotate_half(q) * sin)
RuntimeError: The size of tensor a (28) must match the size of tensor b (1024) at non-singleton dimension
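The RuntimeError above is a broadcasting failure in `q * cos`: one operand has size 28 and the other size 1024 along the same dimension. The 1024 matches the `--seq_length 1024` argument. The 28 may be the number of attention heads (an assumption: 3584 / 128 = 28 if the head dimension is 128), which would mean `q` and `cos` are laid out in different axis orders. The sketch below only illustrates the broadcasting rule in plain Python; the shapes are hypothetical and are not taken from `export_onnx.py` or `modeling_qwen2.py`.

```python
def broadcast_mismatch(shape_a, shape_b):
    """Return (dim_index, size_a, size_b) for the first dimension (left to
    right, after right-aligning the shapes) that cannot broadcast, or None
    if the two shapes are broadcast-compatible."""
    ndim = max(len(shape_a), len(shape_b))
    # Right-align the shapes by padding missing leading dimensions with 1.
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    for dim, (a, b) in enumerate(zip(padded_a, padded_b)):
        # Two sizes are compatible if they are equal or one of them is 1.
        if a != b and a != 1 and b != 1:
            return dim, a, b
    return None


# Hypothetical layouts (assumptions, for illustration only):
q_heads_first = (1, 28, 1024, 128)  # (batch, heads, seq, head_dim)
q_seq_first = (1, 1024, 28, 128)    # (batch, seq, heads, head_dim)
cos_seq_first = (1, 1024, 1, 128)   # (batch, seq, 1, head_dim)

print(broadcast_mismatch(q_heads_first, cos_seq_first))  # (1, 28, 1024)
print(broadcast_mismatch(q_seq_first, cos_seq_first))    # None
```

If the real tensors follow a pattern like this, printing `q.shape` and `cos.shape` just before the failing line in `apply_rotary_pos_emb` would show which operand has the unexpected layout.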
chuxiaoyi2023 commented
pip3 install torch==2.0.1+cpu torchvision==0.15.2 -f https://download.pytorch.org/whl/cpu/torch_stable.html
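The reply suggests pinning torch and torchvision to specific versions. The thread does not say which versions were installed when the error occurred, so a quick check inside the virtual environment may help confirm whether the pin took effect. The following is a minimal sketch using only the standard library; the `SUGGESTED` table and the helper names are illustrative, not part of LLM-TPU.

```python
from importlib import metadata

# Versions taken from the pip3 command in the reply above.
SUGGESTED = {"torch": "2.0.1", "torchvision": "0.15.2"}


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def strip_local(version):
    """Drop a local version suffix such as '+cpu' ('2.0.1+cpu' -> '2.0.1')."""
    return version.split("+", 1)[0]


def check_pins(pins=SUGGESTED):
    """Map each package to (installed_version_or_None, matches_pin)."""
    report = {}
    for package, wanted in pins.items():
        found = installed_version(package)
        matches = found is not None and strip_local(found) == wanted
        report[package] = (found, matches)
    return report


for package, (found, matches) in check_pins().items():
    status = "OK" if matches else "MISMATCH"
    print(f"{package}: installed={found} suggested={SUGGESTED[package]} [{status}]")
```

Run it with the same interpreter that runs `export_onnx.py` (here, the one in `.venv`), since a different environment may report different versions.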