ppl.nn.llm is a collection of Large Language Model (LLM) inference engines based on ppl.nn.
- Linux running on x86_64 or arm64 CPUs
- GCC >= 9.4.0
- CMake >= 3.18
- Git >= 2.7.0
- CUDA Toolkit >= 11.4, with 11.6 recommended (for CUDA)
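
The version requirements above can be checked before building. The following is a minimal sketch, not part of the project: the helper names `version_ge` and `check_tool` are made up for this example, and it assumes a `sort` that supports `-V` (as GNU coreutils does) and that `nvcc` is on `PATH` when the CUDA Toolkit is installed.

```shell
#!/usr/bin/env bash
# Sketch: report whether installed tools meet the minimum versions listed above.

# version_ge A B: succeeds when version A is >= version B (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM CMD [ARGS...]: run CMD, take the first dotted number
# in its output as the version, and print one status line. Always returns 0.
check_tool() {
    local name="$1" minimum="$2" found
    shift 2
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$name: not found (need >= $minimum)"
        return 0
    fi
    found="$("$@" 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
    if [ -z "$found" ]; then
        echo "$name: could not determine version (need >= $minimum)"
    elif version_ge "$found" "$minimum"; then
        echo "$name: $found (ok, need >= $minimum)"
    else
        echo "$name: $found (too old, need >= $minimum)"
    fi
    return 0
}

check_tool "GCC" 9.4.0 gcc -dumpfullversion
check_tool "CMake" 3.18 cmake --version
check_tool "Git" 2.7.0 git --version
check_tool "CUDA Toolkit" 11.4 nvcc --version   # only relevant for CUDA
```

The script only prints a report and never aborts, so a missing `nvcc` does not stop the remaining checks.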
- Installing prerequisites (on Debian 12 or Ubuntu 20.04, for example):
apt-get install build-essential cmake git
- Cloning source code:
git clone https://github.com/openppl-public/ppl.nn.llm.git
- Building from source:
cd ppl.nn.llm
./build.sh -DPPLNN_USE_LLM_CUDA=ON
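
The steps above can also be chained in one script that stops at the first failure. This is a rough sketch rather than an official build script: the `DRY_RUN` switch and the `run` and `quick_start` helpers are hypothetical names introduced here, and the commands themselves are copied unchanged from the steps above. Installing packages with `apt-get` normally requires root privileges.

```shell
#!/usr/bin/env bash
# Sketch: run the quick-start steps in order, stopping at the first failure.
# DRY_RUN is a hypothetical switch added for this example; it defaults to 1,
# so the script only prints the commands unless DRY_RUN=0 is set.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

quick_start() {
    run apt-get install build-essential cmake git
    # Skip the clone when a checkout already exists in the current directory.
    if [ ! -d ppl.nn.llm ]; then
        run git clone https://github.com/openppl-public/ppl.nn.llm.git
    fi
    run cd ppl.nn.llm
    run ./build.sh -DPPLNN_USE_LLM_CUDA=ON
}

quick_start
```

Running the script as-is prints the planned commands; setting `DRY_RUN=0` executes them.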
Refer to ppl.pmx for how to export ONNX models. For API usage, refer to the ppl.nn documentation.
This project is distributed under the Apache License, Version 2.0.