Unsupported model type llava_next
Spycsh opened this issue · 5 comments
System Info
Using the official ghcr.io/huggingface/tgi-gaudi:2.0.1 Docker image.
According to https://github.com/huggingface/tgi-gaudi/blob/habana-main/docs/source/supported_models.md?plain=1, llava-hf/llava-v1.6-mistral-7b-hf should be supported, but it is not.
2024-07-16T04:35:51.631996Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:
/usr/local/lib/python3.10/dist-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
Traceback (most recent call last):
File "/usr/local/bin/text-generation-server", line 8, in <module>
sys.exit(app())
File "/usr/local/lib/python3.10/dist-packages/text_generation_server/cli.py", line 137, in serve
server.serve(
File "/usr/local/lib/python3.10/dist-packages/text_generation_server/server.py", line 223, in serve
asyncio.run(
File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/usr/local/lib/python3.10/dist-packages/text_generation_server/server.py", line 189, in serve_inner
model = get_model(
File "/usr/local/lib/python3.10/dist-packages/text_generation_server/models/__init__.py", line 109, in get_model
raise ValueError(f"Unsupported model type {model_type}")
ValueError: Unsupported model type llava_next
rank=0
2024-07-16T04:35:51.728344Z ERROR text_generation_launcher: Shard 0 failed to start
2024-07-16T04:35:51.728370Z INFO text_generation_launcher: Shutting down shards
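For reference, the error is raised by the model-type dispatch in text_generation_server/models/__init__.py (line 109 in the traceback above): the checkpoint's config reports model_type "llava_next", and get_model() in this image has no branch for it. A quick sanity check of the reported model type, outside of TGI (a minimal sketch using plain transformers; it assumes a transformers version recent enough to know the llava_next config, roughly 4.39+):
# Minimal sketch: print the model_type value that get_model() rejects.
# Assumes transformers >= ~4.39 so that the llava_next config class is available.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("llava-hf/llava-v1.6-mistral-7b-hf")
print(config.model_type)  # expected output: "llava_next"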
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
Both of the following attempts (running the prebuilt image, and building the image from source) fail with the same "Unsupported model type llava_next" error shown above:
export model=llava-hf/llava-v1.6-mistral-7b-hf
export volume=$PWD/data
docker run -p 8080:80 -v $volume:/data --runtime=habana -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.1 --model-id $model --max-input-tokens 1024 --max-total-tokens 2048
git clone https://github.com/huggingface/tgi-gaudi.git
cd tgi-gaudi
docker build --build-arg http_proxy=${http_proxy} --build-arg https_proxy=${https_proxy} -t tgi_gaudi_llava .
docker run -p 8080:80 -v $volume:/data --runtime=habana -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host tgi_gaudi_llava --model-id $model --max-input-tokens 1024 --max-total-tokens 2048
Expected behavior
Since the supported-models documentation claims that llava_next is supported, users should be able to run llava-hf/llava-v1.6-mistral-7b-hf with TGI-Gaudi.
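Concretely, once llava_next is handled, a multimodal request against the running server should work along these lines. This is only a sketch: it assumes the upstream TGI convention of embedding the image as a markdown image tag in the prompt, the standard /generate endpoint on port 8080 as mapped in the commands above, and an example image URL.
# Sketch of the expected usage once llava_next is supported; assumes the
# upstream TGI convention of passing the image via markdown image syntax
# in the prompt, and the /generate endpoint mapped to localhost:8080 above.
import requests

prompt = (
    "![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png)"
    "What is shown in this image?"
)
resp = requests.post(
    "http://localhost:8080/generate",
    json={"inputs": prompt, "parameters": {"max_new_tokens": 64}},
)
print(resp.json()["generated_text"])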
It seems the docs were simply carried over from upstream TGI, so the README does not fully apply to TGI-Gaudi.
@Spycsh @kdamaszk Thanks for bringing up this issue. We have also hit this blocker and need to run llava in TGI-Gaudi. Right now we have TGI-Gaudi serving various LLMs in our product, but we have to use a workaround for llava because of this issue. Given that Optimum Habana now officially supports llava_next, it would be great to get it into this TGI fork. See, for example: https://github.com/search?q=repo%3Ahuggingface%2Foptimum-habana+llava&type=pullrequests
Is there any active work ongoing to support llava in this TGI fork? If so, great. If not, maybe I could drum up some interest in a contribution (either within our company or with our partners).
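In the meantime, for anyone hitting the same blocker, one possible workaround is to run the model through Optimum Habana directly, outside of TGI. Below is a rough sketch only, assuming an optimum-habana release that already supports llava_next and a working Gaudi/HPU software stack; the prompt format follows the llava-v1.6-mistral model card, and exact APIs or performance knobs may differ between releases.
# Rough sketch of running llava_next outside TGI via Optimum Habana.
# Assumes an optimum-habana release with llava_next support and a working
# Gaudi/HPU stack; details may vary between versions.
import requests
from PIL import Image
from transformers import LlavaNextForConditionalGeneration, LlavaNextProcessor
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

adapt_transformers_to_gaudi()  # patch transformers model classes for HPU

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(model_id).to("hpu")

image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png"
image = Image.open(requests.get(image_url, stream=True).raw)
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"

inputs = processor(images=image, text=prompt, return_tensors="pt").to("hpu")
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))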