rust_llama.cpp

LLama.cpp rust bindings.

This fork aims to provide lower-level bindings. Instead of wrapping well-defined C++ abstractions in a set of Rust wrapper types, I plan to bind as much of the C/C++ code as possible with bindgen, then use the quote and syn crates to generate idiomatic bindings on top. After that, I plan to build a framework on the idiomatic bindings. The goal is to make it possible to develop feature additions to llama.cpp in Rust, including but not limited to custom optimizers such as L-BFGS. Pull requests and collaboration are welcome; for now, email me at ward.joshua92@yahoo.com.

Further oxidation of the camelid may follow.


The Rust bindings are mostly based on https://github.com/go-skynet/go-llama.cpp/
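
To illustrate the two-layer idea described above (raw bindings first, idiomatic wrappers on top), here is a minimal sketch. It is not code from this repository: `raw_count_words` is a hypothetical stand-in for a bindgen-generated declaration, written in Rust only so the example compiles without linking llama.cpp, and `count_words` and `CountError` are illustrative names for the kind of safe wrapper that could later be generated with quote and syn.

```rust
use std::ffi::{c_char, c_int, CStr, CString, NulError};

// Hypothetical stand-in for a bindgen-generated item. In a real build this
// would be declared in an `extern "C"` block and linked against the C/C++
// library; it is implemented here only to keep the sketch self-contained.
unsafe extern "C" fn raw_count_words(text: *const c_char) -> c_int {
    if text.is_null() {
        return -1; // C-style error code
    }
    let bytes = unsafe { CStr::from_ptr(text) }.to_bytes();
    bytes
        .split(|b| *b == b' ')
        .filter(|word| !word.is_empty())
        .count() as c_int
}

// Errors surfaced by the idiomatic layer instead of raw integer codes.
#[derive(Debug)]
pub enum CountError {
    // The input contained a NUL byte and cannot be passed as a C string.
    InteriorNul(NulError),
    // The native call reported a negative status code.
    Native(c_int),
}

// Idiomatic wrapper: takes `&str`, returns `Result`, hides the pointer handling.
pub fn count_words(text: &str) -> Result<usize, CountError> {
    let c_text = CString::new(text).map_err(CountError::InteriorNul)?;
    // SAFETY: `c_text` is a valid NUL-terminated string that outlives the call.
    let status = unsafe { raw_count_words(c_text.as_ptr()) };
    if status < 0 {
        Err(CountError::Native(status))
    } else {
        Ok(status as usize)
    }
}

fn main() {
    println!("{:?}", count_words("oxidize the camelid"));
}
```

Callers of the wrapper never touch raw pointers or integer status codes, which is the property the idiomatic layer is meant to provide.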

Building Locally

Note: This repository uses git submodules to keep track of LLama.cpp.

Clone the repository locally:

git clone --recurse-submodules https://github.com/mdrokz/rust-llama.cpp
cd rust-llama.cpp
cargo build

Usage

Add the crate to Cargo.toml:

[dependencies]
llama_cpp_rs = "0.3.0"

Then load a model and run a prediction:

use llama_cpp_rs::{
    options::{ModelOptions, PredictOptions},
    LLama,
};

fn main() {
    let model_options = ModelOptions::default();

    let llama = LLama::new(
        "../wizard-vicuna-13B.ggmlv3.q4_0.bin".into(),
        &model_options,
    )
    .unwrap();

    let predict_options = PredictOptions {
        token_callback: Some(Box::new(|token| {
            println!("token1: {}", token);

            true
        })),
        ..Default::default()
    };

    llama
        .predict(
            "what are the national animals of india".into(),
            predict_options,
        )
        .unwrap();
}
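
The callback in the example above only prints each token. A common follow-up is to collect the streamed tokens and stop early. The sketch below shows one way to do that under two assumptions that should be checked against the crate: that the callback has the shape `Box<dyn Fn(String) -> bool + Send>`, and that returning `false` stops generation (as in go-llama.cpp). `stream_tokens` is a hypothetical stand-in for `predict` so the example runs without a model file.

```rust
use std::sync::{Arc, Mutex};

// Assumed callback shape, mirroring the closure passed as `token_callback`.
type TokenCallback = Box<dyn Fn(String) -> bool + Send>;

// Hypothetical stand-in for a streaming predict call: feeds each token to the
// callback and stops as soon as the callback returns false.
// Returns the number of tokens delivered.
fn stream_tokens(tokens: &[&str], callback: &TokenCallback) -> usize {
    let mut delivered = 0;
    for token in tokens {
        delivered += 1;
        if !callback(token.to_string()) {
            break;
        }
    }
    delivered
}

// Collects streamed tokens into one string, stopping after `max_tokens`.
fn collect_until(tokens: &[&str], max_tokens: usize) -> String {
    // The callback is `Fn`, not `FnMut`, so shared state needs interior
    // mutability; `Arc<Mutex<_>>` also satisfies the `Send` bound.
    let buffer = Arc::new(Mutex::new(Vec::<String>::new()));
    let sink = Arc::clone(&buffer);

    let callback: TokenCallback = Box::new(move |token| {
        let mut collected = sink.lock().unwrap();
        collected.push(token);
        // Keep going only while fewer than `max_tokens` have been collected.
        collected.len() < max_tokens
    });

    stream_tokens(tokens, &callback);

    let text = buffer.lock().unwrap().concat();
    text
}

fn main() {
    let tokens = ["The", " tiger", " and", " the", " peacock"];
    println!("{}", collect_until(&tokens, 3));
}
```

With the real crate, the same closure would be placed in `PredictOptions::token_callback`, and the buffer read after `predict` returns.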

Examples

The examples include Dockerfiles for running them.

See the examples directory.

TODO

  • Implement support for cuBLAS, OpenBLAS & OpenCL #7
  • Implement support for GPU (Metal)
  • Add some test cases
  • Support for fetching models through http & S3
  • Sync with latest master & support GGUF
  • Add some proper examples mdrokz#7

LICENSE

MIT