Issues
0.3.4 #992 - #998 doesn't build
#999 opened by misureaudio - 3
error[E0599]: no method named `is_none_or` found for enum `std::option::Option` in the current scope
#995 opened by Dead-Bytes - 7
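The E0599 above comes from `Option::is_none_or`, which was only stabilized in Rust 1.82, so older toolchains cannot find the method. A minimal sketch (illustrative, not mistral.rs code) of the method and a pre-1.82 equivalent:

```rust
fn main() {
    let x: Option<u32> = Some(4);

    // Requires Rust >= 1.82, where `Option::is_none_or` was stabilized;
    // on older compilers this line produces exactly the E0599 above.
    let new_way = x.is_none_or(|v| v % 2 == 0);

    // Equivalent that builds on older toolchains: `map_or` with a
    // default of `true` for the `None` case.
    let old_way = x.map_or(true, |v| v % 2 == 0);

    assert_eq!(new_way, old_way);
}
```

Updating the toolchain (`rustup update`) is the usual fix on the caller's side.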
KV Cache Quantization
#971 opened by dinerburger - 0
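For context on the request above, a minimal sketch of what 8-bit absmax quantization of a KV-cache slice means in general (function names are illustrative only, not mistral.rs internals):

```rust
// Quantize a slice of f32 values to i8 using a single absmax scale.
// Returns the quantized values plus the scale needed to dequantize.
fn quantize_i8(values: &[f32]) -> (Vec<i8>, f32) {
    let absmax = values.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    let scale = if absmax == 0.0 { 1.0 } else { absmax / 127.0 };
    let q = values.iter().map(|v| (v / scale).round() as i8).collect();
    (q, scale)
}

// Recover approximate f32 values from the i8 representation.
fn dequantize_i8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&x| x as f32 * scale).collect()
}

fn main() {
    let kv = [0.02f32, -1.5, 0.7, 0.0];
    let (q, scale) = quantize_i8(&kv);
    let back = dequantize_i8(&q, scale);
    // Round-trip error is bounded by roughly half a quantization step.
    for (a, b) in kv.iter().zip(&back) {
        assert!((a - b).abs() <= scale / 2.0 + f32::EPSILON);
    }
}
```

This halves (vs. f16) or quarters (vs. f32) KV-cache memory at the cost of a bounded per-element error, which is the trade-off the issue asks about.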
How do I finetune/train models with this?
#980 opened by Tameflame - 1
create_ordering.py not supported with llama 3 loras
#976 opened by kkailaasa - 1
[Feature Request] -- EfficientQAT (Omniquant Successor) and/or ISTA-DASLab Higgs Quant. Models/Formatting
#977 opened by BuildBackBuehler - 0
0.3.4 #967 cargo install fails
#969 opened by misureaudio - 8
0.3.2 #891 build failure on Windows 11
#896 opened by misureaudio - 17
Couldn't run any vision model
#935 opened by GraphicalDot - 0
fast-forward tokens with llguidance
#965 opened by mmoskal - 0
parallel computation of mask in constrained sampling
#964 opened by mmoskal - 0
rejection sampling for `top_p` etc
#963 opened by mmoskal - 2
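For context, issue #963 concerns sampling under `top_p`. A minimal sketch of plain top-p (nucleus) filtering, the step that rejection sampling would replace (illustrative code, not mistral.rs internals):

```rust
// Keep the smallest set of highest-probability tokens whose
// cumulative probability mass reaches `top_p`.
// Input: (token_id, probability) pairs; output: the retained pairs.
fn top_p_filter(mut probs: Vec<(usize, f64)>, top_p: f64) -> Vec<(usize, f64)> {
    // Sort token probabilities in descending order.
    probs.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    let mut cum = 0.0;
    let mut keep = Vec::new();
    for (id, p) in probs {
        keep.push((id, p));
        cum += p;
        if cum >= top_p {
            break;
        }
    }
    keep
}

fn main() {
    let probs = vec![(0, 0.5), (1, 0.3), (2, 0.15), (3, 0.05)];
    let kept = top_p_filter(probs, 0.7);
    assert_eq!(kept.len(), 2); // 0.5 + 0.3 >= 0.7
}
```

The sort over the full vocabulary is what makes this expensive per token; rejection-style approaches avoid materializing the sorted distribution.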
Confusion around loading a GGUF locally
#922 opened by mojadem - 2
Possible problem with candle 0.8.0 - doesn't build on a GTX1650 (CC 7.5) nor a GTX1070 (CC 6.1)
#954 opened by misureaudio - 5
Build on ubuntu 24.04 with src/cast.cu
#951 opened by mostlygeek - 1
DiffusionArchitecture not found in python package
#943 opened by Manojbhat09 - 2
mistralrs-server with n>1 only returns one result
#955 opened by mmoskal - 2
phi3 output garbage on master
#956 opened by mmoskal - 1
is_streaming: true gives unreachable code panic
#953 opened by dancixx - 7
Multi Image and Multi Prompt issue using Mistral.rs
#853 opened by kuladeephx - 5
Create and load standalone quantized UQFF models
#947 opened by FishiaT - 13
Flash Attention not building
#941 opened by Aveline67 - 3
Integrating Mistral.rs with Swiftide
#843 opened by timonv - 7
Error: DriverError(CUDA_ERROR_INVALID_PTX, "a PTX JIT compilation failed") when loading utanh_bf16
#850 opened by nikolaydubina - 14
Docker Build Failure: mistralrs-quant Fails with "No such file or directory" Error
#893 opened by ShivamSphn - 2
Vision interactive mode for gguf models
#882 opened by ShaheerANative - 1
Error: Unable to run LoRA - Adapter files are empty
#929 opened by kkailaasa - 6
Tracking: Metal performance vs. MLX, llama.cpp
#903 opened by EricLBuehler - 1
Mistral.rs server build error
#918 opened by grpathak22 - 2
Mistral.rs server build error
#915 opened by grpathak22 - 0
Speculative decoding support for mistralrs-server
#912 opened by PkmX - 2
Add support for Ministral-8B-Instruct-2410
#897 opened by maralski - 5
Question about UQFF
#836 opened by schnapper79 - 0
Add gemma2 architecture support for GGUF
#901 opened by grpathak22 - 1
How to free memory
#886 opened by jiabochao - 1
Text Completion/Raw Input support?
#890 opened by oofdere - 2
Prompting in interactive mode and specifying different images reuses the first image
#868 opened by beaugunderson - 1
Feature Request: please support InternLM2.5
#876 opened by boshallen - 2
CUDA_ERROR_UNSUPPORTED_PTX_VERSION on Jetson AGX Orin
#867 opened by bmgxyz - 2
Compiled wheels on PyPI would be really useful
#864 opened by simonw - 4
support qwen2 gguf architecture
#851 opened by franklucky001 - 5
load qwen2.5 model locally failed
#852 opened by franklucky001 - 3
How to keep context when chatting with the model?
#839 opened by Aveline67 - 1
run program with multiple GPUs
#834 opened by schnapper79