awslabs/multi-model-server

Update documentation to establish the difference between backend time and backend response time

sachanub opened this issue · 0 comments

There has been some confusion recently about the difference between backend time and backend response time. We should update the documentation, or add code comments, to make the distinction between the two explicit.

Backend response time: Time taken by a backend worker process to handle a request from the frontend worker thread.
Backend time: Total time from when the frontend worker thread schedules a client request to when that thread responds to the client. This includes the backend response time above.
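
To make the two definitions above concrete, here is a minimal Python sketch of where each interval starts and stops. It is only an illustration, not the server's actual code: `handle_request`, `RequestTimings`, the `clock` parameter, and the field names are all hypothetical, and the real frontend's scheduling and response steps are reduced to comments.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RequestTimings:
    """Hypothetical container for the two intervals, in milliseconds."""
    backend_response_time_ms: float  # backend worker handling only
    backend_time_ms: float           # scheduling through response to the client


def handle_request(
    payload: str,
    backend_worker: Callable[[str], str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[str, RequestTimings]:
    """Stand-in for a frontend worker thread serving one client request."""
    # Backend time starts when the frontend worker thread schedules the request.
    scheduled_at = clock()

    # Backend response time covers only the backend worker's handling.
    sent_at = clock()
    result = backend_worker(payload)
    received_at = clock()

    # (Assumption) the response would be written back to the client here.
    responded_at = clock()

    timings = RequestTimings(
        backend_response_time_ms=(received_at - sent_at) * 1000.0,
        backend_time_ms=(responded_at - scheduled_at) * 1000.0,
    )
    return result, timings


# Usage: a fake backend worker that just upper-cases the payload.
result, timings = handle_request("ping", lambda p: p.upper())
```

Because the backend time interval encloses the backend response time interval, `timings.backend_time_ms` is never smaller than `timings.backend_response_time_ms` in this sketch.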