Atome-FE/llama-node

[Error: Could not load model] { code: 'GenericFailure' }

ralyodio opened this issue · 8 comments

Getting this error [Error: Could not load model] { code: 'GenericFailure' } when trying to load a model:

$ node ./bin/llm/llm.js --model ~/models/gpt4-alpaca-lora-30B.ggml.q5_1.bin
[Error: Could not load model] { code: 'GenericFailure' }

I've modified the example a bit to take the model path as a --model argument:

import minimist from 'minimist';
import { LLM } from "llama-node";
import { LLamaRS } from "llama-node/dist/llm/llama-rs.js";
import path from "path";

// Resolve the model path from the --model argument.
const args = minimist(process.argv.slice(2));
const modelPath = args.model;
const model = path.resolve(modelPath);

// Use the llm-rs backend.
const llama = new LLM(LLamaRS);

const template = `how are you`;
const prompt = `Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:

${template}

### Response:`;

// Completion parameters for the llm-rs backend.
const params = {
    prompt,
    numPredict: 128,
    temp: 0.2,
    topP: 1,
    topK: 40,
    repeatPenalty: 1,
    repeatLastN: 64,
    seed: 0,
    feedPrompt: true,
};

const run = async () => {
    try {
        await llama.load({ path: model });
        await llama.createCompletion(params, (response) => {
            process.stdout.write(response.token);
        });
    } catch (err) {
        console.error(err);
    }
};

run();

Try using the llama.cpp backend; I think it supports more model types than llm-rs.

q5_1 may be supported later. I have not upgraded the llm-rs backend for it yet.

my models work fine with llm-rs

@hlhr202 what does q5_1 mean?

Try using the llama.cpp backend; I think it supports more model types than llm-rs.

How do I do this?

It is a quantization type of ggml model; you can check it on the llama.cpp GitHub.

Try using the llama.cpp backend; I think it supports more model types than llm-rs.

How do I do this?

Check it here: https://llama-node.vercel.app/docs/backends/
and
here: https://llama-node.vercel.app/docs/backends/llama.cpp/inference
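
For reference, here is a minimal sketch of the script above switched to the llama.cpp backend, adapted from the linked inference docs. The exact import path and config field names (modelPath, nCtx, nGpuLayers, nTokPredict, etc.) are taken from those docs and may differ across llama-node versions, so treat this as an approximation rather than the definitive API:

import minimist from 'minimist';
import { LLM } from "llama-node";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";
import path from "path";

// Resolve the model path from the --model argument, as in the original script.
const args = minimist(process.argv.slice(2));
const model = path.resolve(args.model);

// Use the llama.cpp backend instead of llm-rs.
const llama = new LLM(LLamaCpp);

// Load config based on the docs example; adjust nCtx / nGpuLayers as needed.
const config = {
    modelPath: model,
    enableLogging: true,
    nCtx: 1024,
    seed: 0,
    f16Kv: false,
    logitsAll: false,
    vocabOnly: false,
    useMlock: false,
    embedding: false,
    useMmap: true,
    nGpuLayers: 0,
};

const template = `how are you`;
const prompt = `Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:

${template}

### Response:`;

const run = async () => {
    try {
        await llama.load(config);
        // Completion params for the llama.cpp backend
        // (nThreads / nTokPredict per the docs example, instead of numPredict).
        await llama.createCompletion({
            nThreads: 4,
            nTokPredict: 128,
            topK: 40,
            topP: 1,
            temp: 0.2,
            repeatPenalty: 1,
            prompt,
        }, (response) => {
            process.stdout.write(response.token);
        });
    } catch (err) {
        console.error(err);
    }
};

run();

With that in place, the original command should work unchanged, e.g. node ./bin/llm/llm.js --model ~/models/gpt4-alpaca-lora-30B.ggml.q5_1.bin, assuming the installed llama-node version supports q5_1 files (see the comment below).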

I think this issue also needs investigation of llama.cpp's LoRA support, but I'm still reading the llama.cpp implementation. I will probably bring this feature soon.

@ralyodio q5_1 models are now supported by the llama.cpp backend here.