examples/run_generation.py fails with an unexpected keyword argument for TextGenerationStreamResponse
gpapilion opened this issue · 1 comment
gpapilion commented
System Info
Python 3.10.12
Requirements:
huggingface_hub==0.20.3
requests==2.31.0
datasets==2.18.0
transformers>=4.37.0
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
- Set up a virtual environment and activate it.
- Navigate to tgi-gaudi/examples.
- Run python3 run_generation.py against a running TGI server.
- The program exits with:
$ python3 run_generation.py --model_id meta-llama/Llama-2-70b-chat-hf --max_concurrent_requests 8 --total_sample_count 1000 --max_output_length 1024 --max_input_length 1024
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
  0%|          | 0/1000 [00:00<?, ?it/s]Thread failed with error: TextGenerationStreamResponse.__init__() got an unexpected keyword argument 'index'
$
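The error message suggests a client/server schema mismatch: the streamed response carries an index field that the response class bundled with huggingface_hub==0.20.3 does not declare. Below is a minimal, self-contained sketch of that failure mode; OlderStreamResponse is a hypothetical stand-in for the older TextGenerationStreamResponse, and the payload is an illustrative example rather than a captured server response.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the TextGenerationStreamResponse shipped with
# huggingface_hub==0.20.3, which does not declare an `index` field.
@dataclass
class OlderStreamResponse:
    token: dict
    generated_text: str | None = None

# Illustrative streamed chunk: a newer server may include extra fields such
# as `index`. Unpacking it into the older dataclass raises the same TypeError
# the benchmark reports.
payload = {"index": 0, "token": {"id": 1, "text": "Hi"}, "generated_text": None}

try:
    OlderStreamResponse(**payload)
except TypeError as err:
    print(err)  # ... __init__() got an unexpected keyword argument 'index'
```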
Expected behavior
The benchmark should run to completion and report results, as it does when huggingface_hub==0.23.5 is installed in place of the pinned huggingface_hub==0.20.3.
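Since the script completes under huggingface_hub==0.23.5, one low-effort mitigation (a sketch only, not part of run_generation.py, and assuming nothing else in the environment pins an older client) is a pre-flight version check that fails fast with an actionable message instead of the mid-run thread error:

```python
import re
from importlib.metadata import version

# Rough parse of the installed huggingface_hub version (release versions only).
raw = version("huggingface_hub")
parts = tuple(int(p) for p in re.findall(r"\d+", raw)[:3])

# 0.23.5 is the version this report confirms works with the benchmark.
if parts < (0, 23, 5):
    raise SystemExit(
        f"huggingface_hub=={raw} is installed; upgrade to >= 0.23.5 so the "
        "streaming response type accepts the fields returned by the server."
    )
```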