oracle/graphpipe

Handling multiple user requests at nearly the same time


Hi, can this server handle that situation?

I think you are asking whether the server can handle multiple concurrent requests. Assuming you mean the Go servers, yes, they can. Two simultaneous requests generally finish faster than two sequential ones, although there is a limit that depends on the complexity of the model and the backend used. If the CPU or GPU is already fully loaded, simultaneous requests can be slower than sending the same requests one after another. Also keep in mind that batching is almost always better, especially on the GPU: a single request with multiple rows is usually faster than multiple requests with a single row each.
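To make the batching advice concrete, here is a minimal sketch in Python. It is not GraphPipe's API: `send`, `predict_batched`, `predict_one_by_one`, and `FakeBackend` are hypothetical names used only for illustration, and `FakeBackend` stands in for a model server so the example runs without one. In a real setup, `send` would wrap whatever client call you use to reach the server.

```python
from typing import Callable, List, Sequence

Row = List[float]
# Hypothetical transport: takes a batch of rows, returns one output row per input row.
Send = Callable[[List[Row]], List[Row]]


def predict_batched(rows: Sequence[Row], send: Send) -> List[Row]:
    """Send all rows in a single request (one round trip)."""
    if not rows:
        return []
    batch = [list(r) for r in rows]
    outputs = send(batch)
    if len(outputs) != len(batch):
        raise ValueError("backend returned a different number of rows than it was sent")
    return outputs


def predict_one_by_one(rows: Sequence[Row], send: Send) -> List[Row]:
    """Send each row as its own request (one round trip per row)."""
    return [send([list(r)])[0] for r in rows]


class FakeBackend:
    """Hypothetical stand-in for a model server; it only counts round trips."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, batch: List[Row]) -> List[Row]:
        self.calls += 1
        # Toy "model": the output for each row is the sum of its values.
        return [[sum(row)] for row in batch]


if __name__ == "__main__":
    rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

    batched = FakeBackend()
    predict_batched(rows, batched)

    single = FakeBackend()
    predict_one_by_one(rows, single)

    print("round trips, batched:", batched.calls)
    print("round trips, one by one:", single.calls)
```

Both functions return the same outputs; the difference is the number of round trips, and on a real server, how well the backend can use the hardware for each call. Actual speedups depend on the model and backend, as noted above.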