michaelfeil/infinity

model name is not consistent across endpoints


Feature request

Add a --served-model-name option to control the model name.
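
For illustration, usage might look like this (the --served-model-name flag is the proposed option and does not exist yet):

docker run -p 8080:8080 michaelf34/infinity:latest \
  --model-name-or-path BAAI/bge-m3 \
  --served-model-name bge-m3 \
  --port 8080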

Motivation

I ran:

docker run -p 8080:8080 michaelf34/infinity:latest --model-name-or-path BAAI/bge-m3 --port 8080

Query the models endpoint:

$ curl -s http://0.0.0.0:8080/models | jq
{
  "data": [
    {
      "id": "BAAI/bge-m3",
      "stats": {
        "queue_fraction": 0,
        "queue_absolute": 0,
        "results_pending": 0,
        "batch_size": 32
      },
      "object": "model",
      "owned_by": "infinity",
      "created": 1711612054,
      "backend": "torch"
    }
  ],
  "object": "list"
}

Query the embeddings endpoint:

$ curl -s -X 'POST' 'http://0.0.0.0:8080/embeddings' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"input": ["string"]}' | jq | grep model
  "model": "BAAIbge-m3",

Via the embeddings endpoint the model is reported as BAAIbge-m3, while the models endpoint reports BAAI/bge-m3. It would be nice to be able to control the name.

vLLM does this, for example, with the following options:

  • --served-model-name: The model name used in the API. If not specified, the model name will be the same as the Hugging Face name.
  • --model: Name or path of the Hugging Face model to use.

Your contribution

I can create a PR for this

Sounds useful to me, would be great to PR it.

You can make it an EngineArg, since it's closely coupled with the model. You might name it model-display-name, defaulting to None. Hoping someone PRs #13, so this might be more compatible then.
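
A minimal sketch of what this could look like, assuming an EngineArgs dataclass; the field name model_display_name and the served_model_name helper are illustrative, not infinity's actual code:

# Hypothetical sketch, not infinity's actual implementation.
from dataclasses import dataclass
from typing import Optional

@dataclass
class EngineArgs:
    model_name_or_path: str
    # Proposed field: overrides the model id reported by /models and
    # echoed by /embeddings. None means: use the Hugging Face name,
    # mirroring vLLM's --served-model-name default.
    model_display_name: Optional[str] = None

    @property
    def served_model_name(self) -> str:
        # Single source of truth for the name exposed on every
        # endpoint, so /models and /embeddings cannot disagree.
        return self.model_display_name or self.model_name_or_path

Both endpoints would then read served_model_name instead of formatting the raw path themselves, which would also remove the stripped-slash inconsistency shown above.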