Issues
0.3.4 #992 - #998 doesn't build
#999 opened by misureaudio - 3
error[E0599]: no method named `is_none_or` found for enum `std::option::Option` in the current scope
#995 opened by Dead-Bytes - 7
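The E0599 above comes from `Option::is_none_or`, which was only stabilized in Rust 1.82, so older toolchains cannot find the method. A minimal sketch (illustrative, not mistral.rs code) of the method and a pre-1.82 equivalent:

```rust
fn main() {
    let x: Option<u32> = Some(4);

    // Requires Rust >= 1.82, where `Option::is_none_or` was stabilized;
    // on older compilers this line produces exactly the E0599 above.
    let new_way = x.is_none_or(|v| v % 2 == 0);

    // Equivalent that builds on older toolchains: `map_or` with a
    // default of `true` for the `None` case.
    let old_way = x.map_or(true, |v| v % 2 == 0);

    assert_eq!(new_way, old_way);
}
```

Updating the toolchain (`rustup update`) is the usual fix on the caller's side.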
KV Cache Quantization
#971 opened by dinerburger - 0
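For context on the request above, a minimal sketch of what 8-bit absmax quantization of a KV-cache slice means in general (function names are illustrative only, not mistral.rs internals):

```rust
// Quantize a slice of f32 values to i8 using a single absmax scale.
// Returns the quantized values plus the scale needed to dequantize.
fn quantize_i8(values: &[f32]) -> (Vec<i8>, f32) {
    let absmax = values.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    let scale = if absmax == 0.0 { 1.0 } else { absmax / 127.0 };
    let q = values.iter().map(|v| (v / scale).round() as i8).collect();
    (q, scale)
}

// Recover approximate f32 values from the i8 representation.
fn dequantize_i8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&x| x as f32 * scale).collect()
}

fn main() {
    let kv = [0.02f32, -1.5, 0.7, 0.0];
    let (q, scale) = quantize_i8(&kv);
    let back = dequantize_i8(&q, scale);
    // Round-trip error is bounded by roughly half a quantization step.
    for (a, b) in kv.iter().zip(&back) {
        assert!((a - b).abs() <= scale / 2.0 + f32::EPSILON);
    }
}
```

This halves (vs. f16) or quarters (vs. f32) KV-cache memory at the cost of a bounded per-element error, which is the trade-off the issue asks about.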
How do I finetune/train models with this?
#980 opened by Tameflame - 1
create_ordering.py not supported with llama 3 loras
#976 opened by kkailaasa - 1
[Feature Request] -- EfficientQAT (Omniquant Successor) and/or ISTA-DASLab Higgs Quant. Models/Formatting
#977 opened by BuildBackBuehler - 0
0.3.4 #967 cargo install fails
#969 opened by misureaudio - 8
0.3.2 #891 build failure on Windows 11
#896 opened by misureaudio - 17
Couldn't run any vision model
#935 opened by GraphicalDot - 0
fast-forward tokens with llguidance
#965 opened by mmoskal - 0
parallel computation of mask in constrained sampling
#964 opened by mmoskal - 0
rejection sampling for `top_p` etc
#963 opened by mmoskal - 2
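For context, issue #963 concerns sampling under `top_p`. A minimal sketch of plain top-p (nucleus) filtering, the step that rejection sampling would replace (illustrative code, not mistral.rs internals):

```rust
// Keep the smallest set of highest-probability tokens whose
// cumulative probability mass reaches `top_p`.
// Input: (token_id, probability) pairs; output: the retained pairs.
fn top_p_filter(mut probs: Vec<(usize, f64)>, top_p: f64) -> Vec<(usize, f64)> {
    // Sort token probabilities in descending order.
    probs.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    let mut cum = 0.0;
    let mut keep = Vec::new();
    for (id, p) in probs {
        keep.push((id, p));
        cum += p;
        if cum >= top_p {
            break;
        }
    }
    keep
}

fn main() {
    let probs = vec![(0, 0.5), (1, 0.3), (2, 0.15), (3, 0.05)];
    let kept = top_p_filter(probs, 0.7);
    assert_eq!(kept.len(), 2); // 0.5 + 0.3 >= 0.7
}
```

The sort over the full vocabulary is what makes this expensive per token; rejection-style approaches avoid materializing the sorted distribution.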
Confusion around loading a GGUF locally
#922 opened by mojadem - 2
Possible problem with candle 0.8.0 - doesn't build on a GTX1650 (CC 7.5) nor a GTX1070 (CC 6.1)
#954 opened by misureaudio - 5
Build on ubuntu 24.04 with src/cast.cu
#951 opened by mostlygeek - 1
DiffusionArchitecture not found in python package
#943 opened by Manojbhat09 - 2
mistralrs-server with n>1 only returns one result
#955 opened by mmoskal - 2
phi3 output garbage on master
#956 opened by mmoskal - 1
is_streaming: true gives unreachable code panic
#953 opened by dancixx - 7
Multi Image and Multi Prompt issue using Mistral.rs
#853 opened by kuladeephx - 5
Create and load standalone quantized UQFF models
#947 opened by FishiaT - 13
Flash Attention not building
#941 opened by Aveline67 - 3
Integrating Mistral.rs with Swiftide
#843 opened by timonv - 7
Error: DriverError(CUDA_ERROR_INVALID_PTX, "a PTX JIT compilation failed") when loading utanh_bf16
#850 opened by nikolaydubina - 14
Docker Build Failure: mistralrs-quant Fails with "No such file or directory" Error
#893 opened by ShivamSphn - 2
Vision interactive mode for gguf models
#882 opened by ShaheerANative - 1
Error: Unable to run LoRA - Adapter files are empty
#929 opened by kkailaasa - 6
Tracking: Metal performance vs. MLX, llama.cpp
#903 opened by EricLBuehler - 1
Mistral.rs server build error
#918 opened by grpathak22 - 2
Mistral.rs server build error
#915 opened by grpathak22 - 0
Speculative decoding support for mistralrs-server
#912 opened by PkmX - 2
Add support for Ministral-8B-Instruct-2410
#897 opened by maralski - 5
Question about UQFF
#836 opened by schnapper79 - 0
Add gemma2 architecture support for GGUF
#901 opened by grpathak22 - 1
How to free memory
#886 opened by jiabochao - 1
Text Completion/Raw Input support?
#890 opened by oofdere - 2
Prompting in interactive mode and specifying different images reuses the first image
#868 opened by beaugunderson - 1
Feature Request: please support InternLM2.5
#876 opened by boshallen - 2
CUDA_ERROR_UNSUPPORTED_PTX_VERSION on Jetson AGX Orin
#867 opened by bmgxyz - 2
Compiled wheels on PyPI would be really useful
#864 opened by simonw - 4
support qwen2 gguf architecture
#851 opened by franklucky001 - 5
load qwen2.5 model locally failed
#852 opened by franklucky001 - 3
How to keep context when chatting with the model?
#839 opened by Aveline67 - 1
run program with multiple GPUs
#834 opened by schnapper79