After switching to CPU loading for a model that was previously loaded on GPU, the chat no longer works.
chinese-wzq commented
Error:
Traceback (most recent call last):
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/gradio/queueing.py", line 388, in call_prediction
    output = await route_utils.call_process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/gradio/route_utils.py", line 217, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1553, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1191, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/gradio/utils.py", line 659, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/modules/ui.py", line 175, in __send_message
    text, chatbot = self.chat_model.on_message(message, top_p, top_k, temperature, presence_penalty, replace_message)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/modules/chat.py", line 105, in on_message
    out, model_tokens, model_state = self.model_utils.run_rnn(model_tokens, model_state, self.model_utils.pipeline.encode(new))
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/modules/model_utils.py", line 39, in run_rnn
    out, model_state = self.model.forward(tokens[:self.CHUNK_LEN], model_state)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/rwkv/model.py", line 1138, in forward
    x, state[i*3+0], state[i*3+1] = ATT(
                                    ^^^^
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/home/example/Build/RWKV_Role_Playing/venv/lib/python3.11/site-packages/rwkv/model.py", line 789, in att_seq_v5_2
    def att_seq_v5_2(self, x, sx, s, ln_w, ln_b, lx_w, lx_b, k_mix, v_mix, r_mix, g_mix, t_decay, t_first, kw, vw, rw, gw, ow, kmx, krx, kmy, kry, vmx, vrx, vmy, vry, rmx, rrx, rmy, rry, gmx, grx, gmy, gry, omx, orx, omy, ory):
        xx = F.layer_norm(x, (x.shape[-1],), weight=ln_w, bias=ln_b)
        sx = torch.cat((sx.unsqueeze(0), xx[:-1,:]))
             ~~~~~~~~~ <--- HERE
        kx = xx * k_mix + sx * (1 - k_mix)
        vx = xx * v_mix + sx * (1 - v_mix)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument tensors in method wrapper_CUDA_cat)
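For context, the final RuntimeError is PyTorch's generic cross-device failure; a minimal standalone snippet (assuming a CUDA build of PyTorch, with illustrative variable names) reproduces the same error:

    import torch

    gpu_t = torch.ones(3, device='cuda:0')  # e.g. a state tensor cached while running on GPU
    cpu_t = torch.ones(3)                   # e.g. activations computed under a CPU strategy

    # Raises: RuntimeError: Expected all tensors to be on the same device,
    # but found at least two devices, cuda:0 and cpu!
    torch.cat((gpu_t, cpu_t))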
BlinkDL commented
Just update your rwkv pip package to the latest version.
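(For anyone landing here: that would be something like pip install --upgrade rwkv run inside the project's virtualenv; the exact install step may differ, so check the project README.)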
chinese-wzq commented
Found the cause. After switching models, the save/ folder has to be deleted; the program should check whether the cached files under the save/ folder match the current configuration before loading them.
This should probably count as a defect in the program, so I'll keep this issue open until that check is added.
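A minimal sketch of what such a check could look like (the manifest format and the save_state/load_state helpers here are illustrative, not the project's actual API; it assumes the cached state is a list of tensors as the rwkv package returns): write a small JSON manifest next to each cached state recording the model and strategy it was created with, and refuse to load on mismatch.

    import json
    import os
    import torch

    def save_state(state, path, model_path, strategy):
        # Cache the state plus a manifest recording the config it was made with.
        torch.save(state, path)
        with open(path + '.json', 'w', encoding='utf-8') as f:
            json.dump({'model_path': model_path, 'strategy': strategy}, f)

    def load_state(path, model_path, strategy):
        # Only load a cached state whose manifest matches the current config.
        manifest_path = path + '.json'
        if not os.path.exists(manifest_path):
            return None  # no manifest: treat the cache as stale
        with open(manifest_path, encoding='utf-8') as f:
            if json.load(f) != {'model_path': model_path, 'strategy': strategy}:
                return None  # saved under a different model/strategy: ignore it
        # map_location keeps a GPU-saved state loadable under a CPU strategy
        device = 'cpu' if strategy.startswith('cpu') else 'cuda'
        return torch.load(path, map_location=device)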
shengxia commented
Well... this is mentioned in the README, but as for checking whether a sav file is compatible, I'll have to figure out how to detect that first.
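One possible detection approach, as a rough sketch (state_matches_runtime is a hypothetical helper, and it assumes the cached state is a list of torch tensors): after loading, compare the device and dtype of the state's tensors against what the current strategy expects, and fall back to a fresh state on mismatch.

    import torch

    def state_matches_runtime(state, expected_device, expected_dtype=None):
        # Reject anything that is not a list/tuple of tensors on the expected device.
        if not isinstance(state, (list, tuple)):
            return False
        for t in state:
            if not isinstance(t, torch.Tensor):
                return False
            if t.device.type != expected_device:
                return False
            if expected_dtype is not None and t.dtype != expected_dtype:
                return False
        return True

    # e.g. before calling model.forward(tokens, state):
    # if not state_matches_runtime(state, 'cpu'):
    #     state = None  # start fresh instead of crashing inside torch.cat

Alternatively, loading with torch.load(path, map_location='cpu') remaps GPU-saved tensors onto the CPU up front, which by itself would avoid the cuda:0/cpu mix in the traceback above; it would not, however, catch a state saved from a different model.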