xusenlinzy/api-for-open-llm

Qwen1.5 inference fails with RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous

Opened this issue · 6 comments

The following items must be checked before submission

  • Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
  • I have read the FAQ section of the project documentation and searched the existing issues / discussions without finding a similar problem or solution.

Type of problem

Model inference and deployment

Operating system

Windows

Detailed description of the problem

# Paste the runtime code here (delete the code block if you don't have it)
PORT=8053

# model related
MODEL_NAME=qwen2
MODEL_PATH=D:/projects/Qwen/models/Qwen1.5-14B-Chat
PROMPT_NAME=qwen2
EMBEDDING_NAME=D:/projects/Qwen/models/m3e-base
ADAPTER_MODEL_PATH=
QUANTIZE=16
CONTEXT_LEN=1200
LOAD_IN_8BIT=false
LOAD_IN_4BIT=false
USING_PTUNING_V2=false
STREAM_INTERVERL=2

# device related
DEVICE=cuda

# "auto", "cuda:0", "cuda:1", ...
DEVICE_MAP=auto
GPUS=
NUM_GPUs=1
DTYPE=half


# api related
API_PREFIX=/v1

USE_STREAMER_V2=false
ENGINE=default

Dependencies

# Please paste the dependencies here
transformers-4.38.1
Everything else matches the project's requirements

Runtime logs or screenshots

# Please paste the run log here
INFO:     47.90.164.232:0 - "POST /v1/chat/completions HTTP/1.1" 200 OK
Traceback (most recent call last):
  File "D:\projects\api-for-open-llm\api-for-open-llm\api\core\default.py", line 281, in _generate
    for output in self.generate_stream_func(self.model, self.tokenizer, params):
  File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\utils\_contextlib.py", line 35, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "D:\projects\api-for-open-llm\api-for-open-llm\api\generation\stream.py", line 85, in generate_stream
    out = model(torch.as_tensor([input_ids], device=device), use_cache=True)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\accelerate\hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\transformers\models\qwen2\modeling_qwen2.py", line 1173, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\transformers\models\qwen2\modeling_qwen2.py", line 998, in forward
    position_ids = position_ids.unsqueeze(0).view(-1, seq_length)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
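
The last frame reshapes `position_ids` to `(-1, seq_length)`, and the shape `[-1, 0]` in the message means `seq_length` was 0, i.e. the model was called with an empty `input_ids`. With zero elements and a known dimension of 0, every size for the `-1` dimension satisfies the element count, so it cannot be inferred. The sketch below is a simplified pure-Python model of that rule; `infer_unknown_dim` is a hypothetical helper for illustration, not PyTorch's implementation.

```python
from math import prod
from typing import List


def infer_unknown_dim(numel: int, shape: List[int]) -> List[int]:
    """Resolve the single -1 entry in `shape` for a tensor with `numel` elements.

    Hypothetical helper that mimics the rule behind the error message.
    """
    if shape.count(-1) != 1:
        raise ValueError("exactly one dimension must be -1")
    # Product of the dimensions that are already known.
    known = prod(d for d in shape if d != -1)
    if known == 0:
        # Dividing by zero is impossible: with numel == 0 any size for -1
        # would fit, which is the "ambiguous" case from the traceback.
        raise ValueError(f"cannot infer -1 in {shape}: known dimensions multiply to 0")
    if numel % known != 0:
        raise ValueError(f"shape {shape} is invalid for {numel} elements")
    return [numel // known if d == -1 else d for d in shape]
```

A non-empty prompt resolves cleanly, for example `infer_unknown_dim(6, [-1, 3])`, while `infer_unknown_dim(0, [-1, 0])` raises, mirroring the failure above.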

Try USE_STREAMER_V2=true

That didn't work; I still get the same error.

Has this problem been resolved yet?

I get the same error with qwen2.

Do you have an example? I can't reproduce this error on my side.

After switching to USE_STREAMER_V2=true it seems to be fixed; the error hasn't shown up again so far.
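
For anyone still hitting this on the default streamer, a defensive check before the forward pass turns the opaque reshape error into a clear message. The sketch below is only an illustration: `keep_last_tokens` is a hypothetical helper, and the idea that the prompt becomes empty through left-truncation with a non-positive token budget is an assumption, not something confirmed from the repository code.

```python
from typing import List, Sequence


def keep_last_tokens(input_ids: Sequence[int], budget: int) -> List[int]:
    """Keep at most `budget` tokens from the end of the prompt.

    Hypothetical helper. A plain `input_ids[-budget:]` silently returns an
    empty list when `budget` is negative and the prompt is no longer than
    abs(budget), so the budget is validated first.
    """
    if budget <= 0:
        raise ValueError(f"token budget must be positive, got {budget}")
    ids = list(input_ids)[-budget:]
    if not ids:
        # An empty prompt would reach the model with seq_length == 0.
        raise ValueError("prompt has no tokens; refusing to call the model")
    return ids


# How an unchecked slice can silently produce an empty prompt:
unchecked = [1, 2, 3][-(-5):]  # same as [1, 2, 3][5:]
```

Here `unchecked` ends up empty, whereas `keep_last_tokens([1, 2, 3], -5)` raises a `ValueError` before the model is ever called.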