PaddlePaddle/PaddleNLP

[Question]: ChatGLMv2 fails to initialize correctly

Closed this issue · 3 comments

Please describe your question

Some weights of ChatGLMv2ForCausalLM were not initialized from the model checkpoint at /home/.paddlenlp/models/THUDM/chatglm2-6b and are newly initialized: ['encoder.layers.2.self_attention.key.weight', 'encoder.layers.0.self_attention.value.bias', 'encoder.layers.0.self_attention.key.weight', 'encoder.layers.11.self_attention.key.weight', 'encoder.layers.15.self_attention.query.weight', 'encoder.layers.24.self_attention.value.bias', 'encoder.layers.7.self_attention.key.weight', 'encoder.layers.24.self_attention.key.weight', 'encoder.layers.19.self_attention.query.weight', 'encoder.layers.11.self_attention.query.weight', 'encoder.layers.20.self_attention.key.bias', 'encoder.layers.23.self_attention.query.bias', 'encoder.layers.25.self_attention.query.weight', 'encoder.layers.4.self_attention.key.bias', 'encoder.layers.6.self_attention.value.bias', 'encoder.layers.16.self_attention.value.bias', 'encoder.layers.17.self_attention.key.bias', 'encoder.layers.21.self_attention.query.weight', 'encoder.layers.24.self_attention.query.weight', 'encoder.layers.26.self_attention.query.bias', 'encoder.layers.23.self_attention.key.bias', 'encoder.layers.23.self_attention.query.weight', 'encoder.layers.21.self_attention.key.weight', 'encoder.layers.25.self_attention.key.weight', 'encoder.layers.27.self_attention.value.bias', 'encoder.layers.2.self_attention.key.bias', 'encoder.layers.25.self_attention.value.weight', 'encoder.layers.20.self_attention.value.bias', 'encoder.layers.18.self_attention.key.weight', 'encoder.layers.12.self_attention.query.bias', 'encoder.layers.14.self_attention.query.bias', 'encoder.layers.5.self_attention.key.bias', 'encoder.layers.24.self_attention.value.weight', 'encoder.layers.17.self_attention.query.weight', 'encoder.layers.7.self_attention.value.weight', 'encoder.layers.18.self_attention.value.weight', 'encoder.layers.22.self_attention.query.weight', 'encoder.layers.12.self_attention.key.weight', 'encoder.layers.17.self_attention.value.bias', 'encoder.layers.13.self_attention.query.bias', 'encoder.layers.22.self_attention.key.bias', 'encoder.layers.1.self_attention.key.bias', 'encoder.layers.5.self_attention.key.weight', 'encoder.layers.26.self_attention.query.weight', 'encoder.layers.12.self_attention.query.weight', 'encoder.layers.0.self_attention.query.weight', 'encoder.layers.16.self_attention.query.weight', 'encoder.layers.27.self_attention.query.bias', 'encoder.layers.3.self_attention.query.weight', 'encoder.layers.25.self_attention.key.bias', 'encoder.layers.1.self_attention.query.weight', 'encoder.layers.5.self_attention.value.bias', 'encoder.layers.21.self_attention.query.bias', 'encoder.layers.17.self_attention.value.weight', 'encoder.layers.10.self_attention.key.weight', 'encoder.layers.22.self_attention.key.weight', 'encoder.layers.19.self_attention.key.bias', 'encoder.layers.24.self_attention.query.bias', 'encoder.layers.24.self_attention.key.bias', 'encoder.layers.21.self_attention.key.bias', 'encoder.layers.22.self_attention.query.bias', 'encoder.layers.6.self_attention.key.weight', 'encoder.layers.4.self_attention.value.bias', 'encoder.layers.13.self_attention.query.weight', 'encoder.layers.11.self_attention.query.bias', 'encoder.layers.2.self_attention.value.weight', 'encoder.layers.9.self_attention.key.bias', 'encoder.layers.26.self_attention.key.bias', 'encoder.layers.2.self_attention.query.weight', 'encoder.layers.3.self_attention.value.weight', 'encoder.layers.15.self_attention.value.bias', 'encoder.layers.22.self_attention.value.bias', 
'encoder.layers.27.self_attention.key.weight', 'encoder.layers.13.self_attention.value.weight', 'encoder.layers.1.self_attention.value.weight', 'encoder.layers.27.self_attention.query.weight', 'encoder.layers.14.self_attention.query.weight', 'encoder.layers.9.self_attention.query.weight', 'encoder.layers.25.self_attention.query.bias', 'encoder.layers.12.self_attention.value.weight', 'encoder.layers.4.self_attention.query.weight', 'encoder.layers.17.self_attention.query.bias', 'encoder.layers.14.self_attention.value.weight', 'encoder.layers.10.self_attention.query.weight', 'encoder.layers.18.self_attention.query.weight', 'encoder.layers.3.self_attention.query.bias', 'encoder.layers.8.self_attention.query.bias', 'encoder.layers.2.self_attention.value.bias', 'encoder.layers.9.self_attention.query.bias', 'encoder.layers.27.self_attention.value.weight', 'encoder.layers.1.self_attention.value.bias', 'encoder.layers.10.self_attention.query.bias', 'encoder.layers.7.self_attention.value.bias', 'encoder.layers.9.self_attention.value.bias', 'encoder.layers.27.self_attention.key.bias', 'encoder.layers.5.self_attention.query.weight', 'encoder.layers.17.self_attention.key.weight', 'encoder.layers.25.self_attention.value.bias', 'encoder.layers.8.self_attention.query.weight', 'encoder.layers.19.self_attention.query.bias', 'encoder.layers.22.self_attention.value.weight', 'encoder.layers.12.self_attention.value.bias', 'encoder.layers.20.self_attention.query.weight', 'encoder.layers.12.self_attention.key.bias', 'encoder.layers.26.self_attention.value.bias', 'encoder.layers.0.self_attention.value.weight', 'encoder.layers.8.self_attention.value.weight', 'encoder.layers.11.self_attention.value.bias', 'encoder.layers.7.self_attention.query.bias', 'encoder.layers.23.self_attention.key.weight', 'encoder.layers.21.self_attention.value.weight', 'encoder.layers.14.self_attention.key.weight', 'encoder.layers.9.self_attention.value.weight', 'encoder.layers.8.self_attention.key.weight', 'encoder.layers.7.self_attention.key.bias', 'encoder.layers.13.self_attention.key.bias', 'encoder.layers.6.self_attention.query.weight', 'encoder.layers.11.self_attention.key.bias', 'encoder.layers.3.self_attention.key.weight', 'encoder.layers.15.self_attention.value.weight', 'encoder.layers.3.self_attention.key.bias', 'encoder.layers.9.self_attention.key.weight', 'encoder.layers.16.self_attention.key.weight', 'encoder.layers.10.self_attention.key.bias', 'encoder.layers.1.self_attention.query.bias', 'encoder.layers.5.self_attention.value.weight', 'encoder.layers.20.self_attention.query.bias', 'encoder.layers.18.self_attention.query.bias', 'encoder.layers.20.self_attention.key.weight', 'encoder.layers.14.self_attention.value.bias', 'encoder.layers.13.self_attention.key.weight', 'encoder.layers.4.self_attention.value.weight', 'encoder.layers.7.self_attention.query.weight', 'encoder.layers.16.self_attention.value.weight', 'encoder.layers.10.self_attention.value.bias', 'encoder.layers.21.self_attention.value.bias', 'encoder.layers.23.self_attention.value.weight', 'encoder.layers.26.self_attention.key.weight', 'encoder.layers.18.self_attention.value.bias', 'encoder.layers.6.self_attention.query.bias', 'encoder.layers.8.self_attention.value.bias', 'encoder.layers.18.self_attention.key.bias', 'encoder.layers.4.self_attention.query.bias', 'encoder.layers.3.self_attention.value.bias', 'encoder.layers.4.self_attention.key.weight', 'encoder.layers.20.self_attention.value.weight', 'encoder.layers.8.self_attention.key.bias', 
'encoder.layers.19.self_attention.value.bias', 'encoder.layers.11.self_attention.value.weight', 'encoder.layers.6.self_attention.value.weight', 'encoder.layers.0.self_attention.query.bias', 'encoder.layers.5.self_attention.query.bias', 'encoder.layers.2.self_attention.query.bias', 'encoder.layers.15.self_attention.key.weight', 'encoder.layers.0.self_attention.key.bias', 'encoder.layers.26.self_attention.value.weight', 'encoder.layers.19.self_attention.key.weight', 'encoder.layers.13.self_attention.value.bias', 'encoder.layers.19.self_attention.value.weight', 'encoder.layers.1.self_attention.key.weight', 'encoder.layers.23.self_attention.value.bias', 'encoder.layers.15.self_attention.query.bias', 'encoder.layers.14.self_attention.key.bias', 'encoder.layers.6.self_attention.key.bias', 'encoder.layers.16.self_attention.query.bias', 'encoder.layers.10.self_attention.value.weight', 'encoder.layers.15.self_attention.key.bias', 'encoder.layers.16.self_attention.key.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

The weights file is the downloaded model_state.pdparams, but the model does not initialize correctly, and as a result it cannot produce correct predictions.
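
For reference, a minimal sketch of the load path that triggers this warning, assuming the model was loaded through the Auto class exactly as in the maintainer's reply below:

from paddlenlp.transformers import AutoModelForCausalLM

# Loads config.json and model_state.pdparams from the local PaddleNLP cache.
model = AutoModelForCausalLM.from_pretrained("THUDM/chatglm2-6b")
# If the "Some weights ... were not initialized" warning is printed here,
# the listed self-attention query/key/value projections were randomly
# re-initialized instead of being read from the checkpoint, so the model
# cannot generate meaningful output.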

What environment are you using? I tested it and did not see any problem:

Python 3.9.16 (main, Dec  7 2022, 01:11:58) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from paddlenlp.transformers import AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("THUDM/chatglm2-6b")
(…)/community/THUDM/chatglm2-6b/config.json: 100%|███████████████████████████████████████████████████████████| 885/885 [00:00<00:00, 141kB/s]
[2024-04-30 14:16:22,828] [    INFO] - We are using <class 'paddlenlp.transformers.chatglm_v2.modeling.ChatGLMv2ForCausalLM'> to load 'THUDM/chatglm2-6b'.
[2024-04-30 14:16:22,828] [    INFO] - Loading configuration file /root/.paddlenlp/models/THUDM/chatglm2-6b/config.json
(…)y/THUDM/chatglm2-6b/model_state.pdparams: 100%|██████████████████████████████████████████████████████| 12.5G/12.5G [02:54<00:00, 71.4MB/s]
[2024-04-30 14:19:18,048] [    INFO] - Loading weights file from cache at /root/.paddlenlp/models/THUDM/chatglm2-6b/model_state.pdparams
[2024-04-30 14:19:30,138] [    INFO] - Loaded weights file from disk, setting weights to model.
W0430 14:19:30.150593  2094 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 11.8, Runtime API Version: 11.8
W0430 14:19:30.174417  2094 gpu_resources.cc:164] device: 0, cuDNN Version: 8.6.
[2024-04-30 14:19:45,617] [    INFO] - All model checkpoint weights were used when initializing ChatGLMv2ForCausalLM.

[2024-04-30 14:19:45,618] [    INFO] - All the weights of ChatGLMv2ForCausalLM were initialized from the model checkpoint at THUDM/chatglm2-6b.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMv2ForCausalLM for predictions without further training.
[2024-04-30 14:19:45,642] [    INFO] - Generation config file not found, using a generation config created from the model config.

Hello, my versions are as follows:
paddle: 2.6.1
paddlenlp: 2.6.1
CUDA: 12.0
Ubuntu: 22.04
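
For completeness, a small snippet to capture these versions from inside the interpreter when filing a report; sys is standard library, and both paddle and paddlenlp expose __version__:

import sys
import paddle
import paddlenlp

print(sys.version)            # the interpreter version turned out to matter here
print(paddle.__version__)     # 2.6.1 in this report
print(paddlenlp.__version__)  # 2.6.1 in this report
print(paddle.version.cuda())  # CUDA version paddle was built against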

Solved, thanks! Downgrading the Python version to 3.9.16 fixed it.
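
A quick way to confirm the fix in the new environment (a sketch, assuming a fresh Python 3.9 interpreter with the same paddle/paddlenlp versions installed):

import sys
assert sys.version_info[:2] == (3, 9), sys.version  # downgraded interpreter

from paddlenlp.transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("THUDM/chatglm2-6b")
# A clean load logs "All model checkpoint weights were used when
# initializing ChatGLMv2ForCausalLM." with no re-initialization warning.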