Issues
Phi 3.5 vision (4B model)
#637 opened by CheeseAndMeat - 0
Not able to run source code
#636 opened by nirvitarka - 4
Quickstart example not working
#489 opened by jmorenobl - 3
flashinfer backend raises RuntimeError: paged_kv_indices must be a 1D tensor
#625 opened by baggiponte - 8
Flash Attention is not installed?
#595 opened by ObliviousDonkey - 1
RuntimeError: CUDA error: no kernel image is available for execution on the device
#535 opened by nethi - 0
Performance issues on AWQ and Lora
#611 opened by dumbPy - 1
Issue with loading AWQ quantized Llama 3.1 70B
#607 opened by dumbPy - 1
Can not start lorax from docker
#605 opened by korlin0110 - 0
Running several adapters on the same input
#606 opened by arnaud-secondlayer - 0
When max total tokens is very large (e.g. 130000) and the request has no max new tokens, the response is wrong
#601 opened by ejiang-eog - 0
Issues loading Llama 3.1 8B Instruct
#592 opened by jonseaberg - 0
The server is failing to run
#591 opened by u650080 - 1
If LoRAX is based on Punica kernels, will it be able to support LoRA adapters for Mistral NeMo 12B?
#549 opened by tensimixt - 3
Docker image error
#556 opened by ejiang-eog - 0
LORAX_USE_GLOBAL_HF_TOKEN is not applied the first time an adapter is called from a private Hugging Face Hub
#541 opened by monologg - 0
Stop word is included on phi-2
#537 opened by yunmanger1 - 7
Fails hard on CUDA error
#523 opened by yunmanger1 - 0
Now that TGI is back under the Apache-2.0 license, will LoRAX merge their updates?
#527 opened by SMAntony - 0
Adding Whisper model
#526 opened by Jeevi10 - 2
Generating garbage output
#521 opened by shreyansh26 - 0
Add echo parameter in request
#518 opened by dennisrall - 1
Can't start my local Llama 3 model server with Docker
#511 opened by cheney369 - 0
Fail to load special token in phi-3
#505 opened by prd-tuong-nguyen - 1
Can't run LoRAX with Docker
#502 opened by cheney369 - 9
Fail to run Phi-3
#485 opened by prd-tuong-nguyen - 0
`AutoTokenizer.from_pretrained` needs `trust_remote_code` set inside `load_module_map`
#466 opened by thincal - 0
Quantized KV Cache
#483 opened by flozi00 - 2
Bug Report: lorax-launcher failed with --source "s3" for model_id "mistralai/Mistral-7B-Instruct-v0.2"
#473 opened by donjing - 0
Support inference on INF2 instance
#477 opened by prd-tuong-nguyen - 0
Reject unknown fields from API requests
#478 opened by noyoshi - 1
Add HTTP status codes to docs
#481 opened by noyoshi - 0
[QUESTION] How to change the Hugging Face model download path in LoRAX when deployed to Kubernetes through a Helm chart
#470 opened by fahimkm - 6
Add all launcher args as optional in the Helm charts
#465 opened by tgaddair - 0
Batch inference endpoint (OpenAI compatible)
#448 opened by tgaddair