sophgo/LLM-TPU

chat.cpp:141: void Qwen::init(const std::vector<int>&, std::string): Assertion `true == ret' failed.

Closed this issue · 0 comments

Environment:

SoC environment
transformers:4.42.4
torch:2.3.1
LLM-TPU:9a744f0/latest 2024.07.23
Driver version: 0.5.1

linaro@bm1684:/usr/lib/cmake/libsophon$ bm_version
SophonSDK version: v24.04.01
sophon-soc-libsophon : 0.5.1
sophon-mw-soc-sophon-ffmpeg : 0.10.0
sophon-mw-soc-sophon-opencv : 0.10.0
BL2 v2.7(release):7b2c33d Built : 16:02:07, Jun 24 2024
BL31 v2.7(release):7b2c33d Built : 16:02:07, Jun 24 2024
U-Boot 2022.10 7b2c33d (Jun 24 2024 - 16:01:43 +0800) Sophon BM1684X
KernelVersion : Linux bm1684 5.4.217-bm1684-g27254622663c #1 SMP Mon Jun 24 16:02:21 CST 2024 aarch64 aarch64 aarch64 GNU/Linux
HWVersion: 0x00
MCUVersion: 0x01
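
For anyone comparing setups, the libsophon driver version can be pulled out of the `bm_version` output above programmatically. This is a minimal sketch: the sample text is copied from the output above, and `MIN_VERSION` is an assumption based only on this report (0.4.9 was said to be too old and 0.5.1 is what the tutorial installs), not an officially documented requirement.

```python
import re
from typing import Optional, Tuple

# Two lines copied from the bm_version output above.
BM_VERSION_OUTPUT = """\
SophonSDK version: v24.04.01
sophon-soc-libsophon : 0.5.1
"""

# Assumption for illustration: treat 0.5.1 as the minimum acceptable driver.
MIN_VERSION = (0, 5, 1)


def libsophon_version(text: str) -> Optional[Tuple[int, ...]]:
    """Return the sophon-soc-libsophon version as a tuple of ints, or None."""
    match = re.search(r"sophon-soc-libsophon\s*:\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


found = libsophon_version(BM_VERSION_OUTPUT)
if found is None:
    print("sophon-soc-libsophon version not found")
else:
    status = "ok" if found >= MIN_VERSION else "too old"
    print("libsophon", ".".join(map(str, found)), status)
```

Comparing tuples of integers avoids the string-comparison pitfall where "0.10.0" would sort before "0.5.1".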

Path:

/home/linaro/LLM-TPU/models/Qwen2/python_demo

Command:

python3 pipeline.py --model_path /data/qwen2-7b_int4_seq8192_1dev.bmodel --tokenizer_path ../support/token_config/ --devid 0 --generation_mode greedy

Problem:

Load ../support/token_config/ ...
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Device [ 0 ] loading ....
[BMRT][bmcpu_setup:498] INFO:cpu_lib 'libcpuop.so' is loaded.
[BMRT][bmcpu_setup:521] INFO:Not able to open libcustomcpuop.so
open usercpu.so, init user_cpu_init
[BMRT][BMProfileDeviceBase:190] INFO:gdma=0, tiu=0, mcu=0
Model[/data/qwen2-7b_int4_seq8192_1dev.bmodel] loading ....
[BMRT][load_bmodel:1939] INFO:Loading bmodel from [/data/qwen2-7b_int4_seq8192_1dev.bmodel]. Thanks for your patience...
[BMRT][load_bmodel:1704] INFO:Bmodel loaded, version 2.2+v1.8.beta.0-221-g902a8a4fe-20240704
[BMRT][load_bmodel:1706] INFO:pre net num: 0, load net num: 61
[BMRT][load_tpu_module:1802] INFO:loading firmare in bmodel
[BMRT][preload_funcs:2121] INFO: core_id=0, multi_fullnet_func_id=22
[BMRT][preload_funcs:2124] INFO: core_id=0, dynamic_fullnet_func_id=23
[bmlib_memory][error] bm_alloc_gmem failed, dev_id = 0, size = 0x7d93000
[BM_CHECK][error] BM_CHECK_RET fail /workspace/libsophon/bmlib/src/bmlib_memory.cpp: bm_malloc_device_byte_heap_mask_u64: 1121
[BMRT][Register:2019] FATAL:coeff alloc failed, size[0x7d93000]
python3: /home/linaro/LLM-TPU/models/Qwen2/python_demo/chat.cpp:141: void Qwen::init(const std::vector<int>&, std::string): Assertion `true == ret' failed.
Aborted
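
The failing request in the log above is reported in hex, which makes it hard to judge how much device memory was actually asked for. The sketch below extracts the size from the `bm_alloc_gmem failed` line and converts it to MiB. The two sample lines are copied from the log; the helper name `failed_alloc_sizes` is hypothetical and not part of libsophon or LLM-TPU.

```python
import re
from typing import List

# Two lines copied from the log above.
LOG = """\
[bmlib_memory][error] bm_alloc_gmem failed, dev_id = 0, size = 0x7d93000
[BMRT][Register:2019] FATAL:coeff alloc failed, size[0x7d93000]
"""


def failed_alloc_sizes(log: str) -> List[int]:
    """Return the byte size of every failed bm_alloc_gmem request in the log."""
    pattern = re.compile(r"bm_alloc_gmem failed.*?size = (0x[0-9a-fA-F]+)")
    return [int(match.group(1), 16) for match in pattern.finditer(log)]


for size in failed_alloc_sizes(LOG):
    print(f"{size} bytes = {size / 2**20:.1f} MiB")
```

For this log the failed request is 131674112 bytes, roughly 125.6 MiB, so the allocation that fails is a single weight (coeff) block rather than the whole model.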


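Before I flashed the driver from the SD card, running Qwen with the command above worked, but the answers were strangely repetitive and made no logical sense. I then learned that the driver was too old (0.4.9 at the time), so I followed the tutorial and flashed driver 0.5.1.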

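Now I cannot even get into the chat. Another issue suggested that the bmodel might have been corrupted during download, so I downloaded it again (#7 (comment)), but the problem is the same. I also ran ./run.sh from the repository home page with Qwen1.5 1.8B and hit the same error. I have rebooted the server as well.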

Are there any follow-up steps I need to do after flashing the driver? I also ran /home/linaro/bsp-debs/linux-headers-install.sh, which installed the related dependencies.