menloresearch/cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
C++ · AGPL-3.0
Issues
- feat: [DESCRIPTION] support mac vulkan (#403, opened by alsyundawy, 0 comments)
- bug: could not stop streaming inference (#313, opened by vansangpfiev, 0 comments)
- feat: llamacpp README.md file (#139, opened by irfanpena, 5 comments)
- feat: CI for VNNI support (#60, opened by vansangpfiev, 0 comments)
- chore: fix macOS codesign CI for cortex.llamacpp (#322, opened by hiento09, 3 comments)
- feat: nightly build release names should map to the commit ID from the upstream llama.cpp repo (#303, opened by hiento09, 1 comment)
- chore: set up a macos-12 self-hosted runner (#317, opened by hiento09, 6 comments)
- feat: support log probabilities, as in the OpenAI API (#262, opened by nguyenhoangthuan99, 1 comment)
- feat: support returning multiple choices (#264, opened by nguyenhoangthuan99, 1 comment)
- feat: [Windows] CI for AMD ROCm support (#9, opened by hiro-v, 1 comment)
- Discussion: review CUDA build flags (#24, opened by vansangpfiev, 1 comment)
- feat: instruction-set detection (#20, opened by vansangpfiev, 1 comment)
- Discussion: batching benchmark and improvement (#164, opened by vansangpfiev, 0 comments)
- feat: get running models (#33, opened by vansangpfiev, 0 comments)
- feat: CI (#1, opened by vansangpfiev, 0 comments)
- feat: example to run backend (#2, opened by vansangpfiev, 0 comments)