mnn-llm

An LLM deployment project based on MNN.

Model Support

To export an LLM model to ONNX, use llm-export.

Currently supported models:

| model | onnx-fp32 | mnn-int4 |
| --- | --- | --- |
| chatglm-6b | Download | Download |
| chatglm2-6b | Download | Download |
| codegeex2-6b | Download | Download |
| Qwen-7B-Chat | Download | Download |
| Baichuan2-7B-Chat | Download | Download |
| Llama-2-7b-chat | Download | Download |

Download int4 models

# <model> like `chatglm-6b`
# linux/macos
./script/download_model.sh <model>

# windows
./script/download_model.ps1 <model>
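For example, to fetch the int4 weights for chatglm-6b on Linux or macOS:

# concrete invocation using a model name from the table above
./script/download_model.sh chatglm-6b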

Build

Current build status: CI builds run for Linux, macOS, Windows, and Android.

Local build

# linux
./script/linux_build.sh

# macos
./script/macos_build.sh

# windows msvc
./script/windows_build.ps1

# android
./script/android_build.sh

The CPU backend is used by default. To use another backend, add the corresponding MNN compile macro to the build script (a sketch follows the list below):

  • cuda: -DMNN_CUDA=ON
  • opencl: -DMNN_OPENCL=ON
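A minimal sketch of what adding such a macro amounts to, assuming the build script boils down to a CMake configure-and-build step (the actual script contents may differ):

# hypothetical manual build with the CUDA backend enabled
mkdir build && cd build
cmake .. -DMNN_CUDA=ON
make -j4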

Run

# linux/macos
./cli_demo # cli demo
./web_demo # web ui demo

# windows
.\Debug\cli_demo.exe
.\Debug\web_demo.exe

# android
adb push libs/*.so build/libllm.so build/cli_demo /data/local/tmp
adb push model_dir /data/local/tmp
adb shell "cd /data/local/tmp && export LD_LIBRARY_PATH=. && ./cli_demo -m model"
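For example, on Linux or macOS, assuming the int4 model was downloaded into a local directory named chatglm-6b (the path is hypothetical; point -m at your actual model directory):

# hypothetical model path; -m selects the model directory, as in the Android example above
./cli_demo -m chatglm-6b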
