[Bug]: chatglm unusable, possibly because of the transformers library version

Question

Opened this issue 2 months ago · 1 comments

wold9168 commented 2 months ago

Installation Method

Docker (Linux)

Version

Latest

OS

Linux

Describe the bug

I deployed this project on Debian 12 with an NVIDIA M40 24G, using option 0 of the Docker deployment methods.

Using chatglm produces the following log:

Traceback (most recent call last):
  File "./request_llms/local_llm_class.py", line 159, in run
    for response_full in self.llm_stream_generator(**kwargs):
  File "./request_llms/bridge_chatglm.py", line 59, in llm_stream_generator
    for response, history in self._model.stream_chat(self._tokenizer,
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm2-6b/7fabe56db91e085c9c027f56f1c654d137bdba40/modeling_chatglm.py", line 1063, in stream_chat
    for outputs in self.stream_generate(**inputs, past_key_values=past_key_values,
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm2-6b/7fabe56db91e085c9c027f56f1c654d137bdba40/modeling_chatglm.py", line 1142, in stream_generate
    logits_warper = self._get_logits_warper(generation_config)
TypeError: _get_logits_warper() missing 1 required positional argument: 'device'

chatglm cannot be loaded successfully.
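The TypeError in the traceback has the shape of a signature mismatch: `stream_generate` in modeling_chatglm.py calls `self._get_logits_warper(generation_config)` with one argument, while the installed method also requires `device`. A minimal sketch of that failure mode follows. The two classes are hypothetical stand-ins, not the real transformers or ChatGLM code.

```python
# Hypothetical stand-ins that only mimic the shape of the failure;
# neither class is the real transformers or ChatGLM implementation.


class NewerGenerationMixin:
    # Assumed newer library signature: `device` is a required positional argument.
    def _get_logits_warper(self, generation_config, device):
        return f"warper for {generation_config} on {device}"


class OlderModelCode(NewerGenerationMixin):
    # Mirrors the call shown in the traceback: written against a
    # one-argument signature, so `device` is never passed.
    def stream_generate(self, generation_config):
        return self._get_logits_warper(generation_config)


def reproduce():
    """Return the TypeError message raised by the mismatched call, or None."""
    try:
        OlderModelCode().stream_generate("cfg")
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

If this reading is right, the fix is either model code that passes `device` or a library version whose method does not require it.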

Screen Shot

Terminal Traceback & Material to Help Reproduce Bugs

The contents of docker-compose.yml are as follows:

## ===================================================
#                docker-compose.yml
## ===================================================
# 1. Choose one of the following options and delete the others.
# 2. Modify the environment variables in the selected option; see the GitHub wiki or config.py for details.
# 3. Choose a method to expose the service port and adjust the configuration accordingly:
    # [Method 1: Linux only, convenient, but not supported on Windows] Share the host's network; this is the default configuration
    # network_mode: "host"
    # [Method 2: all systems, including Windows and MacOS] Port mapping: map the container port to a host port (delete network_mode: "host" first, then add the following)
    # ports:
    #   - "12345:12345"  # Note! 12345 must match the WEB_PORT environment variable
# 4. Finally, run `docker-compose up`.
# 5. To use a GPU, pay attention to the LOCAL_MODEL_DEVICE and NVIDIA GPU runtime options.
## ===================================================

## ===================================================
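## [Option 0] Deploy the project's full capabilities (this is a large image that includes CUDA and LaTeX; not recommended if you have a slow network, a small disk, or no GPU)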
## ===================================================
version: '3'
services:
  gpt_academic_full_capability:
    image: ghcr.io/binary-husky/gpt_academic_with_all_capacity:master
    environment:
      # See `config.py` or the GitHub wiki for all configuration options
      API_KEY:                  '  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                        '
      # USE_PROXY:                '  True                                                                       '
      # proxies:                  '  { "http": "http://localhost:10881", "https": "http://localhost:10881", }   '
      LLM_MODEL:                '  gpt-3.5-turbo                                                              '
      AVAIL_LLM_MODELS:         '  ["gpt-3.5-turbo", "gpt-4", "qianfan", "sparkv2", "spark", "chatglm"]       '
      BAIDU_CLOUD_API_KEY :     '  bTUtwEAveBrQipEowUvDwYWq                                                   '
      BAIDU_CLOUD_SECRET_KEY :  '  jqXtLvXiVw6UNdjliATTS61rllG8Iuni                                           '
      XFYUN_APPID:              '  53a8d816                                                                   '
      XFYUN_API_SECRET:         '  MjMxNDQ4NDE4MzM0OSNlNjQ2NTlhMTkx                                           '
      XFYUN_API_KEY:            '  95ccdec285364869d17b33e75ee96447                                           '
      ENABLE_AUDIO:             '  False                                                                      '
      DEFAULT_WORKER_NUM:       '  20                                                                         '
      WEB_PORT:                 '  12345                                                                      '
      ADD_WAIFU:                '  False                                                                      '
      ALIYUN_APPKEY:            '  RxPlZrM88DnAFkZK                                                           '
      THEME:                    '  Chuanhu-Small-and-Beautiful                                                '
      ALIYUN_ACCESSKEY:         '  LTAI5t6BrFUzxRXVGUWnekh1                                                   '
      ALIYUN_SECRET:            '  eHmI20SVWIwQZxCiTD2bGQVspP9i68                                             '
      LOCAL_MODEL_DEVICE:       '  cuda                                                                       '

    # Load the NVIDIA GPU runtime
    runtime: nvidia
    deploy:
        resources:
          reservations:
            devices:
              - driver: nvidia
                count: 1
                capabilities: [gpu]

    # [WEB_PORT exposure method 1: Linux] Share the host's network
    network_mode: "host"

    # [WEB_PORT exposure method 2: all systems] Port mapping
    # ports:
    #   - "12345:12345"  # 12345 must match WEB_PORT

    # After the container starts, run the main program main.py
    command: >
      bash -c "python3 -u main.py"

Answer 1 · 2024-07-25T17:36:46.000Z

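I noticed that the chatglm repository has this discussion: THUDM/ChatGLM2-6B#682

That discussion points out that some versions of the Python transformers package prevent the chatglm2-6b model from running.

This repo's requirements happen to specify transformers>=4.27.1. Because that constraint has no upper bound, building the container may pull in an upstream release that is too new.

I am trying to modify requirements.txt and rebuild the container to solve this problem.

(I accidentally clicked close while replying, orz)

Before rebuilding the container, I found that upstream had pushed a new version of glm, so I will first check whether that already fixes the problem.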
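Before rebuilding, it can help to check which transformers version actually ended up inside the container. Below is a minimal sketch of such a check. The cutoff `ASSUMED_FIRST_BAD` is a hypothetical placeholder: this issue does not state which release first breaks, so substitute whatever the linked discussion or your own testing indicates. The helper names are illustrative too.

```python
from importlib import metadata  # standard library since Python 3.8

# Hypothetical cutoff, for illustration only: the issue does not say
# which transformers release is the first incompatible one.
ASSUMED_FIRST_BAD = "4.42.0"


def version_tuple(version):
    """Turn '4.41.2' into (4, 41, 2); parsing stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_below(installed, first_bad=ASSUMED_FIRST_BAD):
    """Rough comparison: True if `installed` sorts before `first_bad`."""
    return version_tuple(installed) < version_tuple(first_bad)


def check_installed(package="transformers"):
    """Return (version, is_below_cutoff), or None if the package is absent."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return installed, is_below(installed)


if __name__ == "__main__":
    print(check_installed())
```

Running this inside the container (for example via `docker exec`) shows whether the open-ended `transformers>=4.27.1` requirement resolved to a release at or past the assumed cutoff. If it did, adding an upper bound to that line in requirements.txt is one way to keep the build on an older release.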