VinAIResearch/PhoGPT

How can I get the model to run on vLLM?

xnoname79 opened this issue · 2 comments

Thank you for publishing the project.

I would like to test the model on my local machine through an OpenAI-compatible API, and vLLM seems to be the right project to make that happen.

I would appreciate some advice on what changes are needed to make the model run on vLLM.
Thank you for your help.

I assume you have already given it a try with the vLLM quickstart instructions here: https://docs.vllm.ai/en/latest/getting_started/quickstart.html
What went wrong?
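For reference, a rough sketch of that quickstart path using vLLM's OpenAI-compatible server (the checkpoint name, port, and prompt below are assumptions for illustration; substitute whichever PhoGPT checkpoint you actually downloaded):

```python
# Start vLLM's OpenAI-compatible server first, e.g.:
#   python -m vllm.entrypoints.openai.api_server \
#       --model vinai/PhoGPT-7B5-Instruct --trust-remote-code
# then point the standard OpenAI client at the local endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.completions.create(
    model="vinai/PhoGPT-7B5-Instruct",  # assumed checkpoint name
    prompt="### Câu hỏi: PhoGPT là gì?\n### Trả lời:",
    max_tokens=128,
)
print(response.choices[0].text)
```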

Sorry @datquocnguyen, I'm new to this

You're correct; the model runs on vLLM without issues. At first glance, I thought the project was built on a completely new architecture that was not yet supported in vLLM. After taking a closer look at the code and familiarizing myself with some of the terms, I realized it is built on top of MPT, and that architecture is indeed supported in vLLM.
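In case it helps anyone else finding this thread, here is a minimal offline-inference sketch along those lines. The checkpoint name and the instruction prompt format are assumptions taken from the PhoGPT model card, and `trust_remote_code=True` is passed because the repository ships custom MPT-style modeling code:

```python
from vllm import LLM, SamplingParams

# Checkpoint name assumed for illustration; use the PhoGPT checkpoint you downloaded.
llm = LLM(model="vinai/PhoGPT-7B5-Instruct", trust_remote_code=True)

sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

# PhoGPT-Instruct is described as using a "### Câu hỏi: ...\n### Trả lời:" prompt
# template; this exact wording is an assumption based on the model card.
prompt = "### Câu hỏi: Viết 3 câu giới thiệu về Việt Nam\n### Trả lời:"

outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```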

Thank you for your response; we can consider this thread closed. I'm still learning, so I genuinely appreciate being corrected on any concepts I've misunderstood.