Issues
Different results between llama_tokenize and the original Python transformers tokenizer
#7384 opened by Liufeiran123 - 6
ggml_validate_row_data finding NaN value for IQ4_NL
#7311 opened by bartowski1182 - 3
Custom `seed` values ignored by `llama.cpp HTTP server`
#7381 opened by mirekphd - 3
[Android/Termux] Significantly higher RAM usage with Vulkan compared to CPU only
#7351 opened by egeoz - 4
llama : save downloaded models to local cache
#7252 opened by ggerganov - 1
Why does the server-cuda container consume CPU time?
#7377 opened by wencan - 0
convert-hf-to-gguf.py fails PR #7234
#7380 opened by LostRuins - 0
Can I handle multiple images in the same context?
#7364 opened by Eriter555 - 1
Funny response with LLaMa 3 8B
#7367 opened by Sewlell - 0
bf16 problem
#7365 opened by Zibri - 0
[SYCL] include shared libs in sycl release
#7361 opened by gfody - 1
Description of "-t N" option for server is inaccurate
#7355 opened by tigran123 - 0
Need help on building shared libraries on Windows machine for Android x86_64 (emulator)
#7357 opened by cmpktheo - 4
Improve and expand Wikipedia article about llama.cpp
#7294 opened by fffelix-jan - 1
Possible performance boost with 2-pass online softmax
#7306 opened by zixuanweeei - 8
RPC issues and comments
#7293 opened by steampunque - 2
convert.py still fails on llama3 8B-Instruct downloaded directly from Meta (Huggingface works)
#7339 opened by aleloi - 0
AMD ROCm: 8x22B Model Causes 100% GPU Utilization Stall
#7344 opened by Trat8547 - 1
Flash attention implementations do not handle case where value vectors have different dimension from query vectors
#7343 opened by fairydreaming - 2
Pretokenizer not supported by conversion script
#7338 opened by eleius - 2
Unable to handle multi-user prompts
#7336 opened by OlivesHere - 1
Segmentation Fault on GPU
#7337 opened by djain-fujitsu - 0
Enable RPC for the server
#7292 opened by steampunque - 2
Support Falcon2-11B
#7318 opened by reneleonhardt - 0
Llama3-8b & Perplexity.exe Issue
#7291 opened by InferenceIllusionist - 0
GGML_ASSERT(n_embd_gqa == n_embd_k_gqa) fails in models where key vector dimension is different from value vector dimension
#7331 opened by fairydreaming - 1
Add support for multilingual Viking models, please.
#7309 opened by JohnClaw - 1
Support long-context Llama 3 models
#7312 opened by bachittle - 1
Support for IBM Granite models
#7307 opened by ichDaheim - 2
How to quantize a fine-tuned LLM into GGUF format
#7299 opened by dibyendubiswas1998 - 0
relocation R_X86_64_32 against hidden symbol `__TMC_END__' can not be used when making a shared object
#7301 opened by asarubbo - 0
Llama-3 Instruct tokenizer_config.json changes in relation to the currently fetched llama-bpe configs.
#7289 opened by Spacellary - 4
In my OS, the @ symbol and spaces don't play nicely in the llama.cpp directory.
#7247 opened by atljoseph - 2
Error when trying to convert an HF model which is a LoRA PEFT fine-tuned version of phi-128k
#7287 opened by swarnava112 - 3
Windows MSYS2 compilation error. [SOLVED]
#7275 opened by Zibri - 4
Infinite update_slots issue on latest build (1265c67)
#7283 opened by Leowolf93 - 3
MPI issue on raspberry pi cluster
#7260 opened by zhouwul - 1
How to build llama.cpp's .so file separately and then pass it to the llama_cpp_python / wrapper libraries directly.
#7250 opened by fastdaima - 2
Text Generation task
#7256 opened by rexionmars - 8
Performance regression with CUDA after commit 9c67c277
#7254 opened by rgerganov - 0