sophgo/LLM-TPU

Reply stops halfway through when the question is too long

Closed this issue · 2 comments

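To get more accurate answers, I give ChatGLM3 some reference material along with my question. But I found that when the reference material is too long, roughly a little over 200 characters, the reply often stops halfway through, and in many cases it even ends in the middle of a sentence. How can I solve this? When I ran the original ChatGLM3 model on my PC, this problem did not seem to occur.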

It seems your input is reaching the maximum input shape of the bmodel. Because the bmodel is a static model, its input shape is fixed at conversion time. You can convert a bmodel with a larger shape, like 2048 ..., to solve this.
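
For illustration, here is a rough sketch of why a long prompt can cut the reply short. It assumes the prompt and the reply share one fixed token window, which the thread does not confirm. The function names, the 512-token window, and the 128-token reply threshold are all hypothetical and not part of LLM-TPU.

```python
# Sketch only: assumes prompt tokens and reply tokens share one static window.
# remaining_budget, check_prompt and the 512 / 128 values are hypothetical.

def remaining_budget(prompt_tokens: int, seq_len: int) -> int:
    """Tokens left for the reply after the prompt occupies the static window."""
    return max(seq_len - prompt_tokens, 0)


def check_prompt(prompt_tokens: int, seq_len: int, min_reply_tokens: int = 128) -> str:
    """Describe whether a prompt of this size leaves room for a full reply."""
    budget = remaining_budget(prompt_tokens, seq_len)
    if budget == 0:
        return "prompt fills the window; no room to reply"
    if budget < min_reply_tokens:
        return f"only {budget} tokens left; reply is likely to be cut off"
    return f"{budget} tokens left for the reply"


if __name__ == "__main__":
    # A 400-token prompt in a hypothetical 512-token model vs. a 2048-token one.
    print(check_prompt(400, 512))
    print(check_prompt(400, 2048))
```

Under that assumption, the same prompt that nearly fills a small window leaves plenty of room once the model is converted with a larger shape such as 2048.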

Thanks