Issues
[Bug]: Cannot start GGUF FP16 models
#501 opened by Nero10578 - 1
[Feature]: Support [RecurrentGemmaForCausalLM]
#506 opened by sung-ho-moon - 4
[Feature]: Add Support for aya-23-8b with GGUF
#504 opened by cnmoro - 3
[Feature]: Exllamav2 Q4, Q6, and Q8 cache
#463 opened by Anthonyg5005 - 3
[Bug]: Moe's no longer working
#484 opened by puppetm4st3r - 0
[Feature]: Suggestion for build older versions of aphrodite engine's docker images
#500 opened by puppetm4st3r - 3
[Installation]: pip installs no executable
#499 opened by mkesper - 0
[Bug]: Cannot load 70b exl2 5bpw model across 4 GPUs.
#471 opened by Ph0rk0z - 0
[Bug]: Segmentation fault (core dumped)
#497 opened by ChuanhongLi - 0
[Feature]: Speculative decoding with dual GPUs
#496 opened by josephrocca - 4
[Usage]: OOM crash following Offline Inference setup
#494 opened by eedmond - 1
[Bug]: SnowStorm-v1.15-4x8B: Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=BROADCAST, NumelIn=128, NumelOut=128, Timeout(ms)=600000)
#493 opened by josephrocca - 0
[Feature]: An alternative to `max_tokens` which defaults to `minimum(max_tokens, remaining_tokens)`
#492 opened by josephrocca - 13
[Bug]: Cannot load Mixtral GGUF model?
#482 opened by Nero10578 - 0
[Bug]: /metrics Endpoint Returns 404
#491 opened by adsf0427 - 2
[Bug]: [rank0]: KeyError: 'input_ids'
#485 opened by ChuanhongLi - 0
[Bug]: unable use all the vram in wsl cuda environment
#489 opened by sorasoras - 2
[Usage]: Higher Context Length.
#486 opened by Abulhanan - 0
[Misc]: INT8 kv quant seems removed.
#488 opened by sorasoras - 1
[Installation]: Docker runs out of CPU swap size on 8 GPUs. How to lower swap_space to be less than 4GB per GPU?
#483 opened by elabz - 2
[Bug]: Fails to start with error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 0: invalid start byte
#480 opened by Nero10578 - 2
[Bug]: Running aphrodite throws ImportError
#477 opened by reuank - 0
[New Model]: Phi3ForCausalLM
#478 opened by sparsh35 - 0
[Feature]: request for support DeepseekV2ForCausalLM.
#476 opened by kk3dmax - 7
[Bug]: Flash attention cannot be used on v0.5.3
#468 opened by Nero10578 - 0
[Bug]: GPUExecutor throwing 'TypeError: 'type' object is not subscriptable' on 0.5.3
#470 opened by xyzkpsf - 3
[Installation]: Upload Aphrodite v0.5.2 On Pypi.org
#451 opened by Abulhanan - 2
[Usage]: What to set to get acceptable performance on Pascal GPUs? (Non-P100)
#452 opened by Nero10578 - 8
[Installation]: Installing from source does not work. undefined symbol: _ZN3c104cuda14ExchangeDeviceEa
#453 opened by Nero10578 - 1
[Bug]: Unable to use OpenAI API with an auth key via a web browser due to OPTIONS preflight request returning 401.
#434 opened by LostRuins - 1
[Bug]:
#435 opened by someoneexistsontheinternet - 3
[Bug]: PermissionError: [Errno 13] Permission denied: '/app/aphrodite-engine/.triton'
#458 opened by theobjectivedad - 1
[Bug]: LoRA fails to load
#461 opened by kubernetes-bad - 0
[Bug]: LoRA broken when TP>1
#460 opened by kubernetes-bad - 21
[Performance]: Memory Usage Fix for gguf.
#447 opened by Abulhanan - 4
[Bug]: gguf loading failed. config.json?
#417 opened by juud79 - 0
[Usage]: Please provide the environment variable that closes the KoboldAI Lite page.
#445 opened by online2311 - 10
[Bug]: Mixtral-8x22b-instruct not running with AWQ
#421 opened by SalomonKisters - 0
[Installation]: Cannot install the library
#429 opened by uysalfurkan - 2
[Feature]: Support hqq quantize method.
#418 opened by Minami-su