Qwen2-7B-Instruct: error when exporting to ONNX

Question

zifeng-radxa opened this issue 5 months ago · 1 comments

I have already replaced .venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py,
but exporting to ONNX fails with the following error:

(.venv) root@f9ba65967758:/workspace/test/LLM-TPU/models/Qwen2/compile# python3 export_onnx.py --model_path /workspace/Qwen2-7B-Instruct/ --seq_length 1024 --device cpu
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:28<00:00,  7.02s/it]
Layers: 28
Hidden size: 3584

Convert block & block_cache
  0%|                                                                                                                                                                       | 0/1 [00:00<?, ?it/s]/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py:120: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if seq_len > self.max_seq_len_cached:
  0%|                                                                                                                                                                       | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/workspace/test/LLM-TPU/models/Qwen2/compile/export_onnx.py", line 251, in <module>
    convert_block(i)
  File "/workspace/test/LLM-TPU/models/Qwen2/compile/export_onnx.py", line 162, in convert_block
    torch.onnx.export(
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 516, in export
    _export(
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1612, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1134, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1010, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 914, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/jit/_trace.py", line 1310, in _get_trace_graph
    outs = ONNXTracedModule(
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/jit/_trace.py", line 138, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/jit/_trace.py", line 129, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/compile/export_onnx.py", line 73, in forward
    hidden_states, past_kv = self.layer(
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 786, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 695, in forward
    query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin, position_ids)
  File "/workspace/test/LLM-TPU/models/Qwen2/.venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 171, in apply_rotary_pos_emb
    q_embed = (q * cos) + (rotate_half(q) * sin)
RuntimeError: The size of tensor a (28) must match the size of tensor b (1024) at non-singleton dimension
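
The final RuntimeError is a broadcasting failure in the elementwise product `q * cos`. Below is a minimal pure-Python sketch of the broadcasting rule, not code from export_onnx.py or transformers. The helper name `broadcast_shape` and all three shapes are hypothetical, chosen only to reproduce a 28 vs 1024 mismatch; the real tensor layouts and the dimension index (cut off in the log above) are not known from this issue.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or raise ValueError on mismatch."""
    result = []
    # Broadcasting aligns shapes from the trailing dimension backwards;
    # missing leading dimensions are treated as size 1.
    pairs = zip_longest(reversed(a), reversed(b), fillvalue=1)
    for i, (x, y) in enumerate(pairs):
        if x != y and x != 1 and y != 1:
            dim = max(len(a), len(b)) - 1 - i
            raise ValueError(
                f"size of tensor a ({x}) must match size of tensor b ({y}) "
                f"at non-singleton dimension {dim}"
            )
        result.append(y if x == 1 else x)
    return tuple(reversed(result))


# Hypothetical layouts (assumptions, not taken from the issue):
q = (1, 28, 1024, 128)        # (batch, heads, seq_len, head_dim)
cos_ok = (1, 1, 1024, 128)    # singleton on the heads axis, broadcasts fine
cos_bad = (1, 1024, 1, 128)   # seq_len sits on the heads axis, cannot broadcast

print(broadcast_shape(q, cos_ok))
try:
    broadcast_shape(q, cos_bad)
except ValueError as err:
    print("mismatch:", err)
```

Under these assumed shapes, the failure appears whenever the sequence axis of `cos` lines up with the heads axis of `q`, which is one way a 28 vs 1024 conflict can arise.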

Answer 1 · 2024-07-03T08:56:35.000Z

pip3 install torch==2.0.1+cpu torchvision==0.15.2 -f https://download.pytorch.org/whl/cpu/torch_stable.html
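
The command above pins torch and torchvision to specific CPU builds. As a hedged sketch, the following checks from inside the same virtual environment whether those pins actually took effect. The helper name `check_pinned` is hypothetical; it relies only on the standard-library `importlib.metadata` and ignores any local build suffix such as `+cpu`.

```python
from importlib import metadata


def check_pinned(package: str, expected: str) -> str:
    """Report whether `package` is installed at the expected base version."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    # Drop a local version suffix like "+cpu" before comparing.
    base = installed.split("+")[0]
    status = "ok" if base == expected else "mismatch"
    return f"{package}: {installed} ({status}, expected {expected})"


# Versions taken from the pip command in the answer.
for name, version in (("torch", "2.0.1"), ("torchvision", "0.15.2")):
    print(check_pinned(name, version))
```

Run it with the interpreter from the `.venv` used for export_onnx.py, since a different interpreter may see a different set of installed packages.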