How to run a llama.cpp server with the FastChat API
xx-zhang opened this issue · 1 comment
xx-zhang commented
I have set up the server, but it only outputs a few words, as if blocked, and it runs as a single process that cannot respond quickly. It only loads the model when a request arrives.
fredi-python commented
How did you set up the server?
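For reference, FastChat's OpenAI-compatible API is normally served as three separate processes, so responses do not block on a single worker. This is a sketch of the standard FastChat launch sequence from its documentation, not a confirmed fix for this issue; the model path is a placeholder, and serving a llama.cpp model through FastChat requires a worker that supports that backend.

```shell
# Sketch of the usual FastChat serving stack (placeholder model path):

# 1. Controller: tracks registered workers.
python3 -m fastchat.serve.controller &

# 2. Model worker: loads the model once at startup and keeps it in memory,
#    so requests are not delayed by per-request model loading.
python3 -m fastchat.serve.model_worker --model-path /path/to/model &

# 3. OpenAI-compatible API server: forwards requests to the worker.
python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000 &
```

With this layout the model stays resident in the worker process between requests, which avoids the load-on-request behavior described above.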