Issues
Cannot load microsoft/Phi-3-medium and microsoft/Phi-3-small with TGI-2.0.4
#1974 opened by singh-git10 - 1
Expose `ignore_eos_token` to HTTP endpoints
#1944 opened by nathan-az - 1
Support OpenAI's stop parameter logic
#1979 opened by thomas-schillaci - 0
Deberta V3 not supported
#1992 opened by Stealthwriter - 0
Unable to load quantized commandrplus-medusa on H100
#1991 opened by sdadas - 2
TGI does not always preserve order of grammar's JSON keys/Pydantic arguments
#1956 opened by MoritzLaurer - 0
Gemma not starting with tensor parallelism
#1987 opened by arunpatala - 4
Llama3 Tokenizer Troubles: All added_tokens unrecognized, given id of `None`
#1984 opened by Dtphelan1 - 0
Intel XPU Docker image import error on start
#1983 opened by grafail - 1
Memory usage 3x higher than plain code
#1982 opened by pfan94 - 1
Expose `model` argument in python clients
#1978 opened by anubhavrana - 1
Clarification and supplement to the online docs example
#1904 opened by paulcx - 0
[Feature]: Additional metrics to enable better autoscaling / load balancing of TGI servers in Kubernetes
#1977 opened by EandrewJones - 3
Cannot load Gemma models with TGI 2.0.3
#1968 opened by KCFindstr - 1
LlavaNext Model cannot be started
#1914 opened by paulcx - 0
Wrong tool choice makes server crash
#1976 opened by antonioloison - 0
Low-Rank Adaptation of Large Language Models
#1973 opened by mhou7712 - 4
Add `response_format` to chat/completions
#1966 opened by thomas-schillaci - 3
Launching Idefics2 QLoRA failing on warmup - shape mismatch: value tensor of shape [64, 4096] cannot be broadcast to indexing result of shape [320, 4096]
#1943 opened by tfcoe - 0
Can the model deepseek-ai/DeepSeek-V2-Lite be added?
#1964 opened by Semihal - 1
ValueError: Unsupported model type llava
#1962 opened by pseudotensor - 2
TGI hard crashes after 1 OOM error
#1960 opened by pranavthombare - 7
Phi-3 medium 128k instruct fails to start
#1930 opened by xfalcox - 0
Multiple origins
#1941 opened by LukaszHem - 0
Can't load local models
#1955 opened by danielkorat - 0
LoRA adapter from a local model leads to an error
#1893 opened by philschmid - 1
TGI 2.0.2 CodeLlama error `piece id is out of range.`
#1891 opened by philschmid - 2
Phi-3 not starting on TGI 2.0.3 in Kubernetes cluster
#1907 opened by Cyb4Black - 0
Falcon 11B VLM Support
#1933 opened by ulrichkr - 1
TGI crash during Warming up model - invalid opcode in rotary_emb.cpython-310-x86_64-linux-gnu.so
#1928 opened by zidsi - 0
Version in Docker is not correct
#1926 opened by arunpatala - 0
Wrong validations on `Parameters` in the TGI Python library
#1913 opened by Jason-CKY - 1
error: unexpected argument '–max-input-tokens' found
#1903 opened by moruga123 - 2
Document Request
#1900 opened by oroojlooy - 0
Docs missing for LLaVA NeXT Model
#1905 opened by RonanKMcGovern - 0
StarCoder2 AWQ does not work correctly
#1899 opened by johan12345 - 2
Min P generation parameter
#1885 opened by LawrenceGrigoryan - 3
Question about KV cache
#1883 opened by martinigoyanes - 0
SnapKV support
#1881 opened by icyxp - 0
Multi-Model Endpoint support in SageMaker
#1878 opened by Najib-Haq