janhq/cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
C++ · AGPL-3.0
Issues
Discussion: batching benchmark and improvement
#164 opened by vansangpfiev
feat: Revamp the README.md file
#139 opened by irfanpena
feat: Return logits_prob in `chat_completion`
#135 opened by hiro-v
feat: get running models
#33 opened by vansangpfiev
feat: CI for VNNI support
#60 opened by vansangpfiev
Discussion: Review cuda build flags
#24 opened by vansangpfiev
feat: instructions detecting
#20 opened by vansangpfiev
feat: CI
#1 opened by vansangpfiev
feat: example to run backend
#2 opened by vansangpfiev
feat: [Windows] CI for AMD ROCm support
#9 opened by hiro-v