model name is not consistent across endpoints
bufferoverflow commented
Feature request

Add a `--served-model-name` option to control the model name.
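For illustration, usage could look like this (hypothetical: `--served-model-name` does not exist in infinity yet; the flag name and behavior are the proposal):

```bash
# Hypothetical invocation: --served-model-name is the proposed new flag,
# everything else matches the reproduction command below.
docker run -p 8080:8080 michaelf34/infinity:latest \
  --model-name-or-path BAAI/bge-m3 \
  --served-model-name bge-m3 \
  --port 8080
```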
Motivation
I ran:
```bash
docker run -p 8080:8080 michaelf34/infinity:latest --model-name-or-path BAAI/bge-m3 --port 8080
```
Query the models endpoint:
```
$ curl -s http://0.0.0.0:8080/models | jq
{
  "data": [
    {
      "id": "BAAI/bge-m3",
      "stats": {
        "queue_fraction": 0,
        "queue_absolute": 0,
        "results_pending": 0,
        "batch_size": 32
      },
      "object": "model",
      "owned_by": "infinity",
      "created": 1711612054,
      "backend": "torch"
    }
  ],
  "object": "list"
}
```
Query the embeddings endpoint:
```
$ curl -s -X 'POST' 'http://0.0.0.0:8080/embeddings' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"input": ["string"]}' | jq | grep model
  "model": "BAAIbge-m3",
```
The embeddings endpoint reports the model as `BAAIbge-m3`, while the models endpoint reports `BAAI/bge-m3`. It would be nice to be able to control the name.
vLLM does this, e.g. with the following options (see the example below):

- `--served-model-name`: the model name used in the API. If not specified, the model name will be the same as the huggingface name.
- `--model`: name or path of the huggingface model to use.
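For reference, a vLLM launch using those two flags might look like this (illustrative model choice; the flags themselves are quoted from vLLM's docs above):

```bash
# Serve a model under a short alias instead of its huggingface name.
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --served-model-name my-model
```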
Your contribution
I can create a PR for this.
michaelfeil commented
Sounds useful to me, would be great to get a PR for it. You could make it an `EngineArg`, since it's closely coupled with the model. You might name it `model-display-name`, defaulting to `None`. I'm hoping someone PRs #13, which would make this more compatible.
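A minimal sketch of what that could look like, assuming a dataclass-style args object; the names below (`EngineArgs`, `model_display_name`, `served_model_name`) are illustrative, not infinity's actual definitions:

```python
from dataclasses import dataclass
from typing import Optional


# Illustrative sketch only: EngineArgs, model_display_name and
# served_model_name are assumed names, not infinity's real API.
@dataclass
class EngineArgs:
    model_name_or_path: str
    # Proposed field: the name reported by /models and /embeddings.
    # None means "fall back to model_name_or_path".
    model_display_name: Optional[str] = None

    @property
    def served_model_name(self) -> str:
        # Single source of truth for the name every endpoint reports,
        # so /models and /embeddings cannot drift apart.
        return self.model_display_name or self.model_name_or_path


args = EngineArgs(model_name_or_path="BAAI/bge-m3")
assert args.served_model_name == "BAAI/bge-m3"

args = EngineArgs(model_name_or_path="BAAI/bge-m3", model_display_name="bge-m3")
assert args.served_model_name == "bge-m3"
```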