English / 简体中文
Android Pinyin IME for port of Facebook's LLaMA model in C/C++
Demo:
demo.mp4
llama-jni further encapsulates the common functions of llama.cpp with JNI, enabling mobile applications on Android devices to directly use large language models (LLMs) stored locally on the device.
llama-pinyinIME is a typical use case of llama-jni. By adding an input field component to the Google Pinyin IME, llama-pinyinIME provides a localized AI-assisted input service based on an LLM, with no need for internet connectivity.
The goals of llama-pinyinIME include:
- Developing new features for the Google Pinyin IME so that users can enter tasks that require an LLM to complete in the input field of the IME. After submission, users can watch the results stream into the actual target input position.
- Calling the built-in Prompt feature in the input field, allowing users to quickly switch between Prompt texts stored in local files for tasks such as translation, grammar correction, and general Q&A.
- Supporting input mode switching in the input field. In addition to general input and local LLM input, llama-pinyinIME also provides an internet-assisted input mode based on the OpenAI API and supports custom local Prompt texts.
- Providing support for the various models and input parameters related to llama.cpp in the application settings of llama-pinyinIME.
- Optimizing necessary adaptation features in the Google Pinyin IME, such as Chinese input, cursor control, and candidate words.
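The streaming printout described above can be sketched as a token callback: the native inference loop reports each generated token, and the IME commits it to the target input position immediately. The following is a minimal illustration, not the project's actual code; all names (`TokenListener`, `runInference`) are assumptions, and the fake generator stands in for the JNI-backed call.

```java
public class StreamingDemo {
    /** Callback invoked once per generated token (an assumed interface). */
    interface TokenListener {
        void onToken(String token);
    }

    // Stand-in for the JNI-backed inference call: here it just streams a
    // fixed answer word by word to show the callback flow.
    static void runInference(String prompt, TokenListener listener) {
        for (String token : "Hello from the local LLM".split(" ")) {
            listener.onToken(token + " ");
        }
    }

    public static void main(String[] args) {
        StringBuilder committed = new StringBuilder();
        // In a real IME, onToken would call InputConnection.commitText(...)
        // so the text appears in the target input position as it is generated.
        runInference("translate: 你好", token -> committed.append(token));
        System.out.println(committed.toString().trim());
    }
}
```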
llama-pinyinIME supports the creation of an Android application package (APK). The packaging process requires NDK and CMake; the relevant tool configuration is as follows.
apply plugin: 'com.android.application'

android {
    // Keep consistent with libs/android.jar
    compileSdkVersion 25
    buildToolsVersion "26.0.2"

    aaptOptions {
        noCompress 'dat'
    }

    defaultConfig {
        minSdkVersion 23
        // Keep consistent with libs/android.jar
        // noinspection ExpiredTargetSdkVersion
        targetSdkVersion 25
        versionCode 1
        versionName "1.0.0"
        applicationId "com.sx.llama.pinyinime"
        externalNativeBuild {
            cmake {
                cppFlags "-Wall"
                abiFilters 'armeabi-v7a', 'arm64-v8a', 'x86', 'x86_64'
            }
        }
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.txt'
        }
    }

    externalNativeBuild {
        cmake {
            path 'src/main/cpp/CMakeLists.txt'
            version '3.22.1'
        }
    }

    lintOptions {
        checkReleaseBuilds false
        // Or, if you prefer, you can continue to check for errors in release builds,
        // but continue the build even when errors are found:
        abortOnError false
    }

    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }

    android.applicationVariants.all { variant ->
        variant.outputs.all {
            outputFileName = "ST_Pinyin_V${defaultConfig.versionName}.apk"
        }
    }
}

dependencies {
    testImplementation 'junit:junit:4.13.2'
    androidTestImplementation 'androidx.test.ext:junit:1.1.5'
    implementation 'com.alibaba:fastjson:1.2.83'
    implementation 'cz.msebera.android:httpclient:4.5.8'
    provided files('libs/android.jar')
}
llama-pinyinIME does not contain model files. Please prepare your own LLM, which needs to be supported by the specified version of llama.cpp.
The necessary LLM (e.g. GPT4All) needs to be stored in the mobile application's dedicated folder on the Android external storage device, assuming the path is
/storage/emulated/0/Android/data/com.sx.llama.pinyinime/ggml-vic7b-q5_0.bin
Then the following line of code in PinyinIME.java needs to match its filename:
private String modelName = "ggml-vic7b-q5_0.bin";
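Before the JNI layer tries to load the model, it is worth verifying that the file actually exists at the expected path. The helper below is a hypothetical sketch (not part of the project sources); on a real device the base directory would come from `context.getExternalFilesDir(...)` rather than a string literal.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: resolves the model file under the app's dedicated
// external-storage folder and checks it exists before loading.
public class ModelLocator {
    public static Path resolveModelPath(String baseDir, String modelName) {
        return Paths.get(baseDir, modelName);
    }

    public static boolean isModelPresent(String baseDir, String modelName) {
        return Files.isRegularFile(resolveModelPath(baseDir, modelName));
    }

    public static void main(String[] args) throws IOException {
        // Demo with a temporary directory standing in for the app folder.
        Path tmp = Files.createTempDirectory("llama-demo");
        String modelName = "ggml-vic7b-q5_0.bin";
        System.out.println(isModelPresent(tmp.toString(), modelName)); // false: no file yet
        Files.createFile(tmp.resolve(modelName));
        System.out.println(isModelPresent(tmp.toString(), modelName)); // true
    }
}
```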
The dedicated folder also needs to store the necessary Prompt text file (e.g. 1.txt), assuming the path is
/storage/emulated/0/Android/data/com.sx.llama.pinyinime/1.txt
If you need to use the OpenAI API over the network, the following code in PinyinIME.java needs to be filled in with a valid API Key before creating the APK:
private String openaiKey = "YOUR_OPENAI_API_KEY";
Select an AVD in Android Studio and click the Run icon. After the necessary system settings, users can quickly adjust the running mode of llama-pinyinIME with the button on the left side of the input field.
- Note: The following demonstration is based on a virtual emulator with 12GB of RAM; tests on real physical devices show that the inference speed of existing hardware is still far from practical. llama-pinyinIME can only serve as a validation prototype of the technical route, and its level of completeness is for reference only.
The general usage of llama-pinyinIME is similar to other IMEs on Android devices: it also supports Chinese, English, and punctuation input, and the mobile application built on this basis can also be installed and used on real physical devices.
mode-close.mp4
Based on the Prompt text files stored in the mobile application's dedicated folder on the Android external storage device, the user simply enters the text content in the llama-pinyinIME input field preceded by the filename and a space (1.txt is used by default if no filename is given), then clicks the submit icon on the far left.
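The "filename + space" convention can be sketched as a small parsing step: split the submission at the first space and treat the head as the Prompt filename if it looks like one, otherwise fall back to 1.txt. This is an illustrative sketch only; the class and method names are assumptions, not the project's actual code.

```java
// Hypothetical sketch: routes an input-field submission to a Prompt file.
public class PromptRouter {
    public static final String DEFAULT_PROMPT = "1.txt";

    /** Returns {promptFileName, userText}. */
    public static String[] parse(String input) {
        int space = input.indexOf(' ');
        if (space > 0) {
            String head = input.substring(0, space);
            // Assumption: a Prompt filename is recognized by its .txt suffix.
            if (head.endsWith(".txt")) {
                return new String[] { head, input.substring(space + 1) };
            }
        }
        return new String[] { DEFAULT_PROMPT, input };
    }

    public static void main(String[] args) {
        String[] r1 = parse("2.txt translate this sentence");
        System.out.println(r1[0] + " | " + r1[1]); // 2.txt | translate this sentence
        String[] r2 = parse("fix my grammar please");
        System.out.println(r2[0] + " | " + r2[1]); // 1.txt | fix my grammar please
    }
}
```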
In fact, users can customize as many Prompt text files as they want to handle a variety of input reasoning tasks and scenarios; each filename corresponds to an equivalent llama.cpp command, such as:
./main -m "/storage/emulated/0/Android/data/com.sx.llama.pinyinime/ggml-vic7b-q5_0.bin" -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f "/storage/emulated/0/Android/data/com.sx.llama.pinyinime/1.txt"
./main -m "/storage/emulated/0/Android/data/com.sx.llama.pinyinime/ggml-vic7b-q5_0.bin" -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f "/storage/emulated/0/Android/data/com.sx.llama.pinyinime/2.txt"
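Inside the app, the equivalent of these commands is an argument list passed to the llama.cpp entry point. The sketch below builds that list from the model path and the chosen Prompt file; the flag values are taken verbatim from the commands above, while the helper itself is an assumption, not the project's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: assembles llama.cpp main arguments for a given
// model file and Prompt text file.
public class LlamaArgs {
    public static List<String> build(String modelPath, String promptPath) {
        return new ArrayList<>(Arrays.asList(
                "-m", modelPath,            // model file
                "-n", "256",                // max tokens to generate
                "--repeat_penalty", "1.0",  // repetition penalty
                "--color",
                "-i",                       // interactive mode
                "-r", "User:",              // reverse prompt
                "-f", promptPath            // Prompt text file
        ));
    }

    public static void main(String[] args) {
        String base = "/storage/emulated/0/Android/data/com.sx.llama.pinyinime/";
        System.out.println(build(base + "ggml-vic7b-q5_0.bin", base + "1.txt"));
    }
}
```

Switching tasks then amounts to passing a different Prompt file path (1.txt, 2.txt, ...) to the same builder.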
Taking the grammar-correction task of 1.txt and the translation task of 2.txt as examples, the actual printing effect of llama-pinyinIME is as follows:
mode-local-1.mp4
mode-local-2.mp4
This mode achieves a usable level of responsiveness because the text inference is delegated to the OpenAI API, and it also supports direct calls to local Prompt text files.
mode-cloud-zh.mp4
mode-cloud-en.mp4
- LLaMA — Inference code for LLaMA models.
- llama.cpp — Port of Facebook's LLaMA model in C/C++.
- llama-jni — Android JNI for port of Facebook's LLaMA model in C/C++.
Feel free to dive in! Open an issue or submit PRs.
This project exists thanks to all the people who contribute.
MIT © shixiangcap