vicuna-tools/vicuna-installation-guide

How to run a llama.cpp server with the FastChat API

xx-zhang opened this issue · 1 comment

I have set up the server, but it outputs only a few words at a time, as if blocked, and it runs as a single process that can't respond quickly. It also loads the model only when a request arrives, rather than keeping it in memory.
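For context, a typical FastChat deployment runs three separate processes (controller, model worker, and OpenAI-compatible API server), while llama.cpp ships its own standalone HTTP server. The commands below are a minimal sketch of each approach; the model paths and ports are placeholders, not values from this issue:

```
# FastChat's own serving stack (three processes; model stays loaded in the worker)
python3 -m fastchat.serve.controller
python3 -m fastchat.serve.model_worker --model-path /path/to/vicuna-model
python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000

# Alternatively, llama.cpp's built-in server (loads the model once at startup)
./server -m /path/to/vicuna-model.bin --host 0.0.0.0 --port 8080
```

If the model is being reloaded on every request, that usually means requests are hitting a wrapper script that spawns a fresh process each time, instead of a long-running server like the ones above.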

How did you setup the server?