Issues
Server chat/completion API fails - coroutine object not callable in llama_proxy
#1857 opened by PurnaChandraPanda - 1
Add an option to enable --runtime-repack in llama.cpp
#1860 opened by ekcrisp - 1
server: chat completions returns wrong logprobs model
#1787 opened by domdomegg - 1
Release the GIL
#1836 opened by simonw - 5
Add reranking support
#1794 opened by donguyen32 - 2
Intel GPU not enabled when using -DLLAVA_BUILD=OFF
#1851 opened by dnoliver - 0
With Intel GPU on Windows, llama_perf_context_print reports invalid performance metrics
#1853 opened by dnoliver - 1
Error when building wheels on Linux: `FileNotFoundError: [Errno 2] No such file or directory: 'ninja'`
#1839 opened by sabaimran - 2
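For the `ninja` build failure above, one possible workaround (an assumption, not a confirmed fix from the issue: the error suggests the build backend cannot find the `ninja` generator on `PATH`) is to install the build tools into the same environment before rebuilding the wheel:

```shell
# Install the ninja generator (and cmake) that the wheel build invokes.
pip install ninja cmake

# Rebuild the wheel from source without reusing a cached failed build.
pip install --no-cache-dir llama-cpp-python
```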
Mistral-instruct not using system prompt.
#1832 opened by AkiraRy - 3
[INFORMATION REQUEST] Is it possible to build for GPU enabled target on non-GPU host?
#1841 opened by m-o-leary - 0
Request for prebuilt CUDA wheels for newer version
#1824 opened by XJF2332 - 0
Error when updating to 0.3.2
#1837 opened by paoloski97 - 2
Add support for Qwen2-VL
#1811 opened by PredyDaddy - 1
Update related llama.cpp to support Intel AMX instruction
#1827 opened by nai-kon - 1
Setting seed to -1 (random) or using default LLAMA_DEFAULT_SEED generates a deterministic reply chain
#1809 opened by m-from-space - 1
Updated from 0.2.90 to 0.3.2 and now my GPU won't load
#1835 opened by rookiemann - 1
save logits section in eval() sets dtype to np32 apparently unconditionally?
#1829 opened by robbiemu - 3
Installed everything, but speed on a 3090 is lower than on an industrial GPU. Seems like CUDA is not working.
#1815 opened by lukaLLM - 15
llama-cpp-python 0.3.1 didn't use GPU
#1785 opened by artyomboyko - 0
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='nul' mode='w' encoding='cp932'>
#1828 opened by AkiraRy - 0
llama-server not using GPU
#1826 opened by RakshitAralimatti - 4
Specify GPU Selection (e.g., CUDA:0, CUDA:1)
#1816 opened by RakshitAralimatti - 2
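For the GPU-selection request above (#1816), no dedicated llama-cpp-python flag is confirmed in this list; a common workaround sketch, assuming a CUDA build, is to restrict device visibility with the standard `CUDA_VISIBLE_DEVICES` environment variable before the model is loaded (the model path and `Llama` call below are illustrative only):

```python
import os

# Must be set before any CUDA context is created in this process.
# "1" exposes only the second physical GPU; the process then sees it as device 0.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# Hypothetical usage afterwards (placeholder model path):
# from llama_cpp import Llama
# llm = Llama(model_path="model.gguf", n_gpu_layers=-1)
```

Setting the variable in the shell before launching Python achieves the same effect.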
Prebuilt CUDA wheels not working
#1822 opened by mjwweb - 3
llama-cpp-python not using GPU on Google Colab
#1780 opened by AnirudhJM24 - 2
AttributeError: function 'llama_sampler_init_tail_free' not found after compiling llama.cpp with hipBLAS
#1818 opened by Micromanner - 2
Unable to Use GPU with llama-cpp-python on Jetson Orin
#1779 opened by watchstep - 2
top_p = 1 causes deterministic outputs
#1797 opened by oobabooga - 0
Unable to pip install
#1804 opened by chinthasaicharan - 0
Low-level examples broken after [feat: Update sampling API for llama.cpp (#1742)]
#1803 opened by mite51 - 0
Long Context Generation Crashes Google Colab Instance
#1792 opened by kazunator - 1
Can't install with Vulkan support in Ubuntu 24.04
#1789 opened by wannaphong - 0
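For the Vulkan install problem above (#1789), a hedged sketch of the usual source-build route: llama-cpp-python forwards `CMAKE_ARGS` to the underlying llama.cpp build, and llama.cpp exposes a Vulkan backend toggle. Whether this resolves the Ubuntu 24.04 failure is not confirmed here; the Vulkan SDK/driver packages must already be present.

```shell
# Build from source with the llama.cpp Vulkan backend enabled.
CMAKE_ARGS="-DGGML_VULKAN=on" pip install --no-cache-dir llama-cpp-python
```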
_logger.py: KeyError: 5 [bugfix] [patch]
#1778 opened by themanyone - 2
Speculative decoding gives weird results in v. 0.3
#1770 opened by mobeetle - 2
Missing async llm call
#1774 opened by ivanstepanovftw - 0
[FEAT]: TLS Certificate Support
#1768 opened by isgallagher