Can't load the model: "No compiled kernel found"
Zhou-Yujie opened this issue · 3 comments
I'm running under WSL. Loading the model always fails and the program stops.
```
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.16) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
2023-05-28 21:02:00.484 | INFO | __main__:<module>:16 - CONTENT_DIR: /mnt/c/Users/MEIP-users/Desktop/ChatPDF-main/ChatPDF-main/content
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
2023-05-28 21:02:25.002 | DEBUG | text2vec.sentence_model:__init__:74 - Use device: cuda
2023-05-28 21:02:28.328 | DEBUG | textgen.chatglm.chatglm_model:__init__:94 - Device: cuda
No compiled kernel found.
Compiling kernels : /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.c -shared -o /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.so
Load kernel : /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.so
Setting CPU quantization kernel threads to 4
Using quantization cache
Applying quantization to glm layers
Killed
```
Fix it as the warning suggests: the urllib3 or chardet version probably doesn't match what your installed requests supports.
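A minimal sketch of that fix, assuming the warning comes from an older system `requests` whose dependency check rejects urllib3 1.26.16 (the exact supported ranges depend on the installed `requests` version, so treat the pins below as illustrative):

```bash
# Either upgrade requests so it accepts the newer urllib3 ...
pip install --upgrade requests

# ... or pin urllib3/chardet back into the range the installed requests expects
pip install "urllib3<1.26" "chardet>=3.0.2,<3.1.0"
```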
```
To create a public link, set `share=True` in `launch()`.
2023-05-30 08:17:40.710 | DEBUG | text2vec.sentence_model:__init__:74 - Use device: cuda
2023-05-30 08:17:44.559 | DEBUG | textgen.chatglm.chatglm_model:__init__:94 - Device: cuda
No compiled kernel found.
Compiling kernels : /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.c -shared -o /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.so
Load kernel : /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.so
Setting CPU quantization kernel threads to 4
Using quantization cache
Applying quantization to glm layers
Killed
```
It doesn't seem to be a version problem; I fixed the version warning and it still dies. Could it be that the kernel can't be found? I checked my machine: a 4-core CPU with 16 GB of RAM.
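Note that the kernel is in fact found and loaded in the log above ("Load kernel : ... quantization_kernels_parallel.so"), so `Killed` at the quantization step more likely means the Linux OOM killer terminated the process. By default WSL2 only exposes roughly half of the host's RAM to Linux, which on a 16 GB machine can be too little for text2vec plus chatglm-6b-int4. One possible mitigation, assuming WSL2, is to raise the limit in a `.wslconfig` file in your Windows user profile and then restart WSL with `wsl --shutdown` (values below are illustrative; leave headroom for Windows itself):

```ini
[wsl2]
memory=12GB
swap=16GB
```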
This is an issue with chatglm-6b-int4 itself; check the ChatGLM issues to see whether there's a fix there. Alternatively, run on CPU: set all the device parameters to cpu.
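A minimal sketch of CPU-only loading, assuming the standard `transformers` path that the chatglm-6b-int4 model card documents (the parameter names inside this project's own wrappers may differ):

```python
from transformers import AutoModel, AutoTokenizer

model_id = "THUDM/chatglm-6b-int4"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# The ChatGLM readme recommends .float() for CPU inference; the int4
# quantization kernels are still compiled, so gcc/OpenMP must be present,
# and loading still needs enough free RAM to avoid the OOM kill above.
model = AutoModel.from_pretrained(model_id, trust_remote_code=True).float()
model = model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```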