Issues
GPTQ uint4 quantization broken
#207 opened by endomorphosis - 0
Integration of llama3.1 fixes
#197 opened by Feelas - 1
llama3.1-70B-instruct: 422 error "Template error: unknown test: test iterable is unknown (in <string>:99)"
#218 opened by minmin-intel - 2
When running llama2 7b, concurrent inference on some 2k-length prompts crashes the TGI service
#216 opened by yao531441 - 5
Unsupported model type llava_next
#186 opened by Spycsh - 3
LLaVA support
#149 opened by JoeyTPChou - 1
example/run_generation.py fails with unexpected argument for TextGenerationStreamingResponse
#189 opened by gpapilion - 0
Misleading documentation
#174 opened by 12010486 - 0
https://github.com/huggingface/tgi-gaudi/pull/176 causes a performance regression in benchmarks
#184 opened by mandy-li - 3
ValueError: Unsupported model type t5
#172 opened by JunxiChhen - 1
Low throughput when using TGI-Gaudi with bigcode/starcoderbase-3b on Gaudi2
#166 opened by vishnumadhu365 - 4
Integrate critical PR from TGI upstream
#155 opened by luoyiroy - 3
Clarification on past_key_values type for Starcoder
#116 opened by vidyasiv - 3
v2.0.0-release: 8 extra tokens appended to the input tokens trigger huggingface_hub.errors.ValidationError: Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32512. Given: 32008 `inputs` tokens and 512 `max_new_tokens`; no such issue in v1.2.2-release
#146 opened by IT-Forrest - 4
How to use the FP8 feature in TGI-Gaudi
#95 opened by lvliang-intel - 4
Update the base image from 1.14 to 1.15
#127 opened by yafshar - 6
HPUGraph destructor issue when installing dill
#130 opened by yafshar - 9
Docker build issue
#97 opened by akarX23 - 32
Issue running meta-llama/Llama-2-13b-chat-hf
#54 opened by muhammad-asn - 1
Cargo error: clap
#47 opened by muhammad-asn