PygmalionAI/aphrodite-engine

PygmalionAI's large-scale inference engine

Python · AGPL-3.0

Issues

[Bug]: Cannot start GGUF FP16 models
#501 opened 23 days ago by Nero10578
4
[Bug]: pip install fails due to incompatible torch 2.3.0
#505 opened 22 days ago by houmie
1
[Feature]: Support [RecurrentGemmaForCausalLM]
#506 opened 22 days ago by sung-ho-moon
3
[Feature]: Add Support for aya-23-8b with GGUF
#504 opened 24 days ago by cnmoro
4
[Feature]: Exllamav2 Q4, Q6, and Q8 cache
#463 opened 2 months ago by Anthonyg5005
3
[Bug]: MoEs no longer working
#484 opened a month ago by puppetm4st3r
3
[Feature]: Suggestion for building older versions of aphrodite engine's docker images
#500 opened a month ago by puppetm4st3r
0
[Installation]: pip installs no executable
#499 opened a month ago by mkesper
3
[Bug]: Docker container refuses connection (read ECONNRESET)
#498 opened a month ago by elabz
0
[Bug]: Cannot load 70b exl2 5bpw model across 4 GPUs.
#471 opened 2 months ago by Ph0rk0z
14
[Bug]: Segmentation fault (core dumped)
#497 opened a month ago by ChuanhongLi
0
[Feature]: Speculative decoding with dual GPUs
#496 opened a month ago by josephrocca
0
[Feature]: WARNING: Model is quantized. Forcing float16 datatype
#487 opened a month ago by sorasoras
4
[Usage]: OOM crash following Offline Inference setup
#494 opened a month ago by eedmond
3
[Bug]: Cannot load llama-3 gguf based models
#473 opened a month ago by EugeoSynthesisThirtyTwo
1
[Bug]: torch._dynamo.exc.BackendCompilerFailed with command-r-plus
#472 opened 2 months ago by heungson
3
[Bug]: SnowStorm-v1.15-4x8B: Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=BROADCAST, NumelIn=128, NumelOut=128, Timeout(ms)=600000)
#493 opened a month ago by josephrocca
0
[Feature]: An alternative to `max_tokens` which defaults to `minimum(max_tokens, remaining_tokens)`
#492 opened a month ago by josephrocca
0
[Bug]: Cannot load Mixtral GGUF model?
#482 opened a month ago by Nero10578
13
[Bug]: /metrics Endpoint Returns 404
#491 opened a month ago by adsf0427
0
[Bug]: [rank0]: KeyError: 'input_ids'
#485 opened a month ago by ChuanhongLi
2
[Bug]: Unable to use all the VRAM in WSL CUDA environment
#489 opened a month ago by sorasoras
0
[Usage]: Higher Context Length.
#486 opened a month ago by Abulhanan
2
[Misc]: INT8 KV quant seems to have been removed.
#488 opened a month ago by sorasoras
0
[Installation]: Docker runs out of CPU swap size on 8 GPUs. How to lower swap_space to be less than 4GB per GPU?
#483 opened a month ago by elabz
1
[Bug]: Fails to start with error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 0: invalid start byte
#480 opened a month ago by Nero10578
2
[Bug]: Running aphrodite throws ImportError
#477 opened a month ago by reuank
2
[New Model]: Phi3ForCausalLM
#478 opened a month ago by sparsh35
0
[Feature]: Request for DeepseekV2ForCausalLM support.
#476 opened a month ago by kk3dmax
0
[Bug]: Flash attention cannot be used on v0.5.3
#468 opened 2 months ago by Nero10578
7
[Bug]: Int8 k/v cache calibration doesn't work with QWen model?
#475 opened a month ago by bash99
0
[Bug]: GPUExecutor throwing 'TypeError: 'type' object is not subscriptable' on 0.5.3
#470 opened 2 months ago by xyzkpsf
2
[Installation]: Upload Aphrodite v0.5.2 On Pypi.org
#451 opened 2 months ago by Abulhanan
3
[Usage]: What to set to get acceptable performance on Pascal GPUs? (Non-P100)
#452 opened 2 months ago by Nero10578
2
[Installation]: Installing from source does not work. undefined symbol: _ZN3c104cuda14ExchangeDeviceEa
#453 opened 2 months ago by Nero10578
8
[Bug]: Unable to use OpenAI API with an auth key via a web browser due to OPTIONS preflight request returning 401.
#434 opened 2 months ago by LostRuins
1
[Bug]:
#435 opened 2 months ago by someoneexistsontheinternet
1
[Bug]: PermissionError: [Errno 13] Permission denied: '/app/aphrodite-engine/.triton'
#458 opened 2 months ago by theobjectivedad
3
[Usage]: LoRA adapter parameter during inference
#464 opened 2 months ago by alokgupta1996
1
[Bug]: LoRA fails to load
#461 opened 2 months ago by kubernetes-bad
1
[Bug]: LoRA broken when TP>1
#460 opened 2 months ago by kubernetes-bad
0
[Installation]: ValueError: 17 is not a valid GGMLQuantizationType
#448 opened 2 months ago by Abulhanan
21
[Performance]: Memory Usage Fix for gguf.
#447 opened 2 months ago by Abulhanan
3
[Bug]: gguf loading failed. config.json?
#417 opened 3 months ago by juud79
4
[Usage]: Please provide the environment variable that closes the KoboldAI Lite page.
#445 opened 2 months ago by online2311
0
[Bug]: Mixtral-8x22b-instruct not running with AWQ
#421 opened 2 months ago by SalomonKisters
10
[Installation]: Cannot install the library
#429 opened 2 months ago by uysalfurkan
0
[Usage]: Odd use of GPU count and tensor parallelism
#426 opened 2 months ago by puppetm4st3r
2
[Feature]: Provide configuration via env vars or a configuration file
#425 opened 2 months ago by alexandreteles
0
[Feature]: Support HQQ quantization method.
#418 opened 3 months ago by Minami-su
0