Issues
Cannot load microsoft/Phi-3-medium and microsoft/Phi-3-small with TGI-2.0.4
#1974 opened by singh-git10 - 1
Expose `ignore_eos_token` to HTTP endpoints
#1944 opened by nathan-az - 1
Support OpenAI's stop parameter logic
#1979 opened by thomas-schillaci - 0
Deberta V3 not supported
#1992 opened by Stealthwriter - 0
Unable to load quantized commandrplus-medusa on H100
#1991 opened by sdadas - 2
TGI does not always preserve order of grammar's JSON keys/Pydantic arguments
#1956 opened by MoritzLaurer - 0
Gemma not starting with tensor parallelism
#1987 opened by arunpatala - 4
Llama3 Tokenizer Troubles: All added_tokens unrecognized, given id of `None`
#1984 opened by Dtphelan1 - 0
Intel XPU Docker image import error on start
#1983 opened by grafail - 1
Memory usage 3x higher than plain code
#1982 opened by pfan94 - 1
Expose `model` argument in python clients
#1978 opened by anubhavrana - 1
Clarification and supplement to the online docs example
#1904 opened by paulcx - 0
[Feature]: Additional metrics to enable better autoscaling / load balancing of TGI servers in Kubernetes
#1977 opened by EandrewJones - 3
Cannot load Gemma models with TGI 2.0.3
#1968 opened by KCFindstr - 1
LlavaNext Model cannot be started
#1914 opened by paulcx - 0
Wrong tool choice makes server crash
#1976 opened by antonioloison - 0
Low-Rank Adaptation of Large Language Models
#1973 opened by mhou7712 - 4
Add `response_format` to chat/completions
#1966 opened by thomas-schillaci - 3
Launching Idefics2 QLoRA failing on warmup - shape mismatch: value tensor of shape [64, 4096] cannot be broadcast to indexing result of shape [320, 4096]
#1943 opened by tfcoe - 0
Can the model deepseek-ai/DeepSeek-V2-Lite be added?
#1964 opened by Semihal - 1
ValueError: Unsupported model type llava
#1962 opened by pseudotensor - 2
TGI hard crashes after 1 OOM error
#1960 opened by pranavthombare - 7
Phi-3 medium 128k instruct fails to start
#1930 opened by xfalcox - 0
Multiple origins
#1941 opened by LukaszHem - 0
Can't load local models
#1955 opened by danielkorat - 0
LoRA adapter from a local model leads to an error
#1893 opened by philschmid - 1
TGI 2.0.2 CodeLlama error `piece id is out of range.`
#1891 opened by philschmid - 2
Phi-3 not starting on TGI 2.0.3 in Kubernetes cluster
#1907 opened by Cyb4Black - 0
Falcon 11B VLM Support
#1933 opened by ulrichkr - 1
TGI crash during Warming up model - invalid opcode in rotary_emb.cpython-310-x86_64-linux-gnu.so
#1928 opened by zidsi - 0
Version in Docker is not correct
#1926 opened by arunpatala - 0
Wrong validations on `Parameters` in the TGI Python library
#1913 opened by Jason-CKY - 1
error: unexpected argument '–max-input-tokens' found
#1903 opened by moruga123 - 2
Document Request
#1900 opened by oroojlooy - 0
Docs missing for LLaVA NeXT Model
#1905 opened by RonanKMcGovern - 0
StarCoder2 AWQ does not work correctly
#1899 opened by johan12345 - 2
Min P generation parameter
#1885 opened by LawrenceGrigoryan - 3
Question about KV cache
#1883 opened by martinigoyanes - 0
SnapKV support
#1881 opened by icyxp - 0
Multi-Model Endpoint support in SageMaker
#1878 opened by Najib-Haq