huggingface/tgi-gaudi

examples/run_generation.py fails with unexpected argument for TextGenerationStreamResponse

gpapilion opened this issue · 1 comment

System Info

Python 3.10.12

Requirements:
huggingface_hub==0.20.3
requests==2.31.0
datasets==2.18.0
transformers>=4.37.0

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

  1. Set up a virtual environment and activate it.
  2. Navigate to tgi-gaudi/examples.
  3. Run python3 run_generation.py against a functional server.
  4. The program exits with:
    $ python3 run_generation.py --model_id meta-llama/Llama-2-70b-chat-hf --max_concurrent_requests 8 --total_sample_count 1000 --max_output_length 1024 --max_input_length 1024
    None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
    0%| | 0/1000 [00:00<?, ?it/s]Thread failed with error: TextGenerationStreamResponse.__init__() got an unexpected keyword argument 'index'
    $
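
For context, the failure appears to come from the newer TGI server including an extra index field in streamed responses that the client-side dataclass in huggingface_hub 0.20.3 does not accept. The snippet below is a minimal, self-contained sketch of that failure mode and one possible client-side workaround; the stub dataclass and its field names are illustrative stand-ins, not the actual huggingface_hub definitions.

    # Minimal sketch of the failure mode. StreamResponseStub is a stand-in
    # for huggingface_hub's TextGenerationStreamResponse; the field names
    # here are illustrative assumptions, not the real class definition.
    from dataclasses import dataclass, fields
    from typing import Optional

    @dataclass
    class StreamResponseStub:
        token: dict
        generated_text: Optional[str] = None
        details: Optional[dict] = None

    # A stream chunk from a newer server carries an extra "index" key.
    chunk = {"index": 0, "token": {"id": 1, "text": "a"}, "generated_text": None}

    try:
        # Passing the raw chunk straight into the dataclass reproduces the error:
        # TypeError: __init__() got an unexpected keyword argument 'index'
        StreamResponseStub(**chunk)
    except TypeError as err:
        print(err)

    # One possible workaround: drop keys the dataclass does not declare.
    known = {f.name for f in fields(StreamResponseStub)}
    resp = StreamResponseStub(**{k: v for k, v in chunk.items() if k in known})
    print(resp)
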

Expected behavior

The expected behavior is for the script to run the benchmark and produce results, as it does with huggingface_hub==0.23.5.
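
Until the pinned requirements are updated, a guard like the sketch below would make the incompatibility fail fast with a clear message instead of a thread error. This is only a hedged sketch: the minimum version is taken from the working version reported above, and it assumes the packaging library is available in the environment.

    # Sketch: fail fast if the installed huggingface_hub is too old to parse
    # the newer stream response format. 0.23.5 is the version reported to
    # work in this issue; whether earlier 0.23.x releases also work is an
    # assumption that has not been verified here.
    import huggingface_hub
    from packaging.version import Version

    MIN_HUB_VERSION = "0.23.5"

    if Version(huggingface_hub.__version__) < Version(MIN_HUB_VERSION):
        raise RuntimeError(
            f"huggingface_hub>={MIN_HUB_VERSION} is required for streaming; "
            f"found {huggingface_hub.__version__}"
        )
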

@regisss @libinta I have opened a PR to address this.