xusenlinzy/api-for-open-llm

Failed to load the ChatGLM3 model with ptuning_v2

Closed this issue · 4 comments

The following items must be checked before submission

  • Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
  • I have read the FAQ section of the project documentation and searched the existing issues / discussions without finding a similar problem or solution.

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

The .env file is as follows:

# Service port
PORT=8051

# Model name
MODEL_NAME=chatglm3
# Set MODEL_PATH to the folder containing our chatglm3 model weights
MODEL_PATH=/Algorithm/LLM/ChatGLM3/weights/chatglm3-6b
# PROMPT_NAME=chatglm3

# device related
# GPU parallelization strategy
# DEVICE_MAP=auto
# Number of GPUs
NUM_GPUs=1
# GPU index
GPUS='1'

# vllm related
# Enable half precision to speed up inference and reduce GPU memory usage
DTYPE=half

# api related
# API prefix
API_PREFIX=/v1

# API_KEY, any string will do here
OPENAI_API_KEY='EMPTY'

# ChatGLM3 Tuning
ADAPTER_MODEL_PATH=/Algorithm/LLM/ChatGLM3/output/tool_alpaca_pt-20231205-165625-128-2e-2
USING_PTUNING_V2=True

The p-tuning v2 training was done with the official ChatGLM3 code.

Dependencies

# Please paste the dependencies here

Runtime logs or screenshots

If line 161 of api-for-open-llm/api/adapter/model.py is changed to

use_ptuning_v2 = kwargs.get("use_ptuning_v2", True)

the following error occurs:

Traceback (most recent call last):
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/server.py", line 2, in <module>
    from api.models import app, EMBEDDED_MODEL, GENERATE_ENGINE
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/models.py", line 142, in <module>
    GENERATE_ENGINE = create_generate_model()
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/models.py", line 48, in create_generate_model
    model, tokenizer = load_model(
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/adapter/model.py", line 250, in load_model
    model, tokenizer = adapter.load_model(
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/adapter/model.py", line 135, in load_model
    model = self.load_adapter_model(model, tokenizer, adapter_model, is_chatglm, config_kwargs, **kwargs)
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/adapter/model.py", line 180, in load_adapter_model
    model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
  File "/home/zp/.conda/envs/baichuan2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'ChatGLMModel' object has no attribute 'prefix_encoder'
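
For reference, the AttributeError happens because the base model was built without pre_seq_len set in its config, so ChatGLMModel.__init__ never creates self.prefix_encoder. Below is a minimal sketch of the official ChatGLM3 p-tuning v2 loading flow for comparison (paths are the ones from this issue; pre_seq_len=128 is an assumption read off the checkpoint directory name, not something confirmed by the project code):

import os
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

model_path = "/Algorithm/LLM/ChatGLM3/weights/chatglm3-6b"
adapter_path = "/Algorithm/LLM/ChatGLM3/output/tool_alpaca_pt-20231205-165625-128-2e-2"

# pre_seq_len must be set before the model is instantiated; this is what
# makes ChatGLMModel.__init__ create the prefix_encoder module.
config = AutoConfig.from_pretrained(model_path, trust_remote_code=True, pre_seq_len=128)  # 128 is an assumption
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, config=config, trust_remote_code=True)

# Keep only the prefix-encoder weights from the p-tuning v2 checkpoint
# and load them into the freshly created prefix_encoder.
prefix_state_dict = torch.load(os.path.join(adapter_path, "pytorch_model.bin"), map_location="cpu")
new_prefix_state_dict = {
    k[len("transformer.prefix_encoder."):]: v
    for k, v in prefix_state_dict.items()
    if k.startswith("transformer.prefix_encoder.")
}
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

model = model.half().cuda()
model.transformer.prefix_encoder.float()
model.eval()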

If the source code of api-for-open-llm/api/adapter/model.py is left unmodified, the following error occurs instead:

Traceback (most recent call last):
  File "/home/zp/.conda/envs/baichuan2/lib/python3.10/site-packages/peft/config.py", line 181, in _get_peft_type
    config_file = hf_hub_download(
  File "/home/zp/.conda/envs/baichuan2/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
    validate_repo_id(arg_value)
  File "/home/zp/.conda/envs/baichuan2/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/Algorithm/LLM/ChatGLM3/output/tool_alpaca_pt-20231205-165625-128-2e-2'. Use `repo_type` argument if needed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/server.py", line 2, in <module>
    from api.models import app, EMBEDDED_MODEL, GENERATE_ENGINE
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/models.py", line 142, in <module>
    GENERATE_ENGINE = create_generate_model()
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/models.py", line 48, in create_generate_model
    model, tokenizer = load_model(
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/adapter/model.py", line 250, in load_model
    model, tokenizer = adapter.load_model(
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/adapter/model.py", line 135, in load_model
    model = self.load_adapter_model(model, tokenizer, adapter_model, is_chatglm, config_kwargs, **kwargs)
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/adapter/model.py", line 183, in load_adapter_model
    model = self.load_lora_model(model, adapter_model, model_kwargs)
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/adapter/model.py", line 152, in load_lora_model
    return PeftModel.from_pretrained(
  File "/home/zp/.conda/envs/baichuan2/lib/python3.10/site-packages/peft/peft_model.py", line 305, in from_pretrained
    PeftConfig._get_peft_type(
  File "/home/zp/.conda/envs/baichuan2/lib/python3.10/site-packages/peft/config.py", line 187, in _get_peft_type
    raise ValueError(f"Can't find '{CONFIG_NAME}' at '{model_id}'")
ValueError: Can't find 'adapter_config.json' at '/Algorithm/LLM/ChatGLM3/output/tool_alpaca_pt-20231205-165625-128-2e-2'
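
For reference, when use_ptuning_v2 stays False the adapter path falls through to the LoRA branch shown in the traceback, which is roughly equivalent to the following (a minimal sketch, not the project's exact load_lora_model code):

from peft import PeftModel

# PeftModel.from_pretrained expects an adapter_config.json inside the
# adapter directory; a p-tuning v2 checkpoint only ships the prefix-encoder
# weights (pytorch_model.bin), so the lookup fails with the ValueError above.
model = PeftModel.from_pretrained(
    model,
    "/Algorithm/LLM/ChatGLM3/output/tool_alpaca_pt-20231205-165625-128-2e-2",
)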

Just fixed it; please pull the latest code and try again.

It still fails to load; the error is as follows:

Traceback (most recent call last):
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/server.py", line 2, in <module>
    from api.models import app, EMBEDDED_MODEL, GENERATE_ENGINE
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/models.py", line 142, in <module>
    GENERATE_ENGINE = create_generate_model()
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/models.py", line 48, in create_generate_model
    model, tokenizer = load_model(
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/adapter/model.py", line 316, in load_model
    model, tokenizer = adapter.load_model(
  File "/Algorithm/LLM/Baichuan2/api-for-open-llm/api/adapter/model.py", line 141, in load_model
    model = self.model_class.from_pretrained(
  File "/home/zp/.conda/envs/baichuan2/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/zp/.conda/envs/baichuan2/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2966, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/zp/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 856, in __init__
    self.transformer = ChatGLMModel(config, empty_init=empty_init, device=device)
  File "/home/zp/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 765, in __init__
    self.prefix_encoder = PrefixEncoder(config)
  File "/home/zp/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 81, in __init__
    self.embedding = torch.nn.Embedding(config.pre_seq_len,
  File "/home/zp/.conda/envs/baichuan2/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 142, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs),
TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=NoneType), but expected one of:
 * (tuple of ints size, *, tuple of names names, torch.memory_format memory_format, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
 * (tuple of ints size, *, torch.memory_format memory_format, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
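
For what it's worth, this TypeError is what torch.nn.Embedding raises when one of the sizes fed to it is not a plain int; a minimal reproduction follows (which config field ends up None here is an assumption, not taken from the project code):

import torch

# PrefixEncoder builds an embedding of shape (pre_seq_len, hidden_size);
# if either value is None the size tuple is no longer a "tuple of ints"
# and torch.empty() fails as in the traceback above.
pre_seq_len = None  # hypothetical: a config value left unset
torch.nn.Embedding(pre_seq_len, 4096)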

Just fixed another bug; please update and try again~

Thanks a lot, the model now loads successfully~