
An LLM deployment project based on MNN.


mnn-llm


Model Support

To export an LLM model to an ONNX model, use llm-export.

The following models are currently supported:

model               onnx-fp32   mnn-int4
chatglm-6b          Download    Download
chatglm2-6b         Download    Download
codegeex2-6b        Download    Download
Qwen-7B-Chat        Download    Download
Baichuan2-7B-Chat   Download    Download
Llama-2-7b-chat     Download    Download

Download an int4 Model

# <model> like `chatglm-6b`
# linux/macos
./script/download_model.sh <model>

# windows
./script/download_model.ps1 <model>
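The platform split above can be scripted. A minimal sketch that picks the matching download script for the current OS and prints the command (chatglm2-6b is just an example model from the table above; the script paths are the ones shipped in this repo):

```shell
#!/bin/sh
# Sketch: choose the platform-appropriate download script and print
# the command to run. MODEL may be any name from the model table;
# chatglm2-6b here is only an example.
MODEL=chatglm2-6b
case "$(uname -s)" in
  Linux|Darwin) CMD="./script/download_model.sh  $MODEL" ;;  # linux/macos
  *)            CMD="./script/download_model.ps1 $MODEL" ;;  # windows (powershell)
esac
echo "$CMD"
```

Run it from the repository root so the relative `./script/` paths resolve.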

Build

Current build status:

System    build-test
Linux     Build Status
macOS     Build Status
Windows   Build Status
Android   Build Status

CPU-Only

# linux
./script/linux_build.sh

# macos
./script/macos_build.sh

# windows msvc
./script/windows_build.ps1

# android
./script/android_build.sh

CUDA/OPENCL

TODO

Run

# linux/macos
./cli_demo # cli demo
./web_demo # web ui demo

# windows
.\Debug\cli_demo.exe
.\Debug\web_demo.exe
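Putting the sections together, the linux workflow is download, build, then run. A sketch that prints the three commands in order rather than executing them (chatglm2-6b is an example model; the script and demo names are the ones used in this README):

```shell
#!/bin/sh
# Sketch of the full linux workflow: fetch int4 weights, do a
# CPU-only build, then start the command-line demo. Commands are
# printed, not executed, so this can be read as a checklist.
MODEL=chatglm2-6b
for step in \
  "./script/download_model.sh $MODEL" \
  "./script/linux_build.sh" \
  "./cli_demo"
do
  echo "$step"
done
```

On macOS, substitute `./script/macos_build.sh` for the build step.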

Reference