Ollama JavaScript Library

The Ollama JavaScript library provides the easiest way to integrate your JavaScript project with Ollama.

Getting Started

npm i ollama

Usage

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)

Streaming responses

Response streaming can be enabled by setting stream: true. The function call then returns an AsyncGenerator, where each part is one object in the stream.

import ollama from 'ollama'

const message = { role: 'user', content: 'Why is the sky blue?' }
const response = await ollama.chat({ model: 'llama2', messages: [message], stream: true })
for await (const part of response) {
  process.stdout.write(part.message.content)
}
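
When the complete text is also needed once streaming finishes, the parts can be accumulated while they are printed. The following is a minimal sketch: collectStream is a hypothetical helper, not part of the library, and it assumes each part carries the message.content shape shown above.

```javascript
// Hypothetical helper (not part of the ollama package): consumes any async
// iterable of chat parts and returns the concatenated message content.
async function collectStream(stream, onPart = () => {}) {
  let full = ''
  for await (const part of stream) {
    // Assumption: each part looks like { message: { content: string } }.
    const text = part.message?.content ?? ''
    onPart(text) // e.g. (t) => process.stdout.write(t)
    full += text
  }
  return full
}

// Usage with the library (needs a running Ollama server):
//   const stream = await ollama.chat({ model: 'llama2', messages: [message], stream: true })
//   const text = await collectStream(stream, (t) => process.stdout.write(t))
```

Because the helper only relies on the async iterable protocol, it can be exercised with a plain async generator in tests.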

Create

import ollama from 'ollama'

const modelfile = `
FROM llama2
SYSTEM "You are mario from super mario bros."
`
await ollama.create({ model: 'example', modelfile: modelfile })

API

The Ollama JavaScript library's API is designed around the Ollama REST API.

chat

ollama.chat(request)

request <Object>: The request object containing chat parameters.
- model <string>: The name of the model to use for the chat.
- messages <Message[]>: Array of message objects representing the chat history.
  - role <string>: The role of the message sender ('user', 'system', or 'assistant').
  - content <string>: The content of the message.
  - images <Uint8Array[] | string[]>: (Optional) Images to be included in the message, either as Uint8Array or base64 encoded strings.
- format <string>: (Optional) Set the expected format of the response (json).
- stream <boolean>: (Optional) When true an AsyncGenerator is returned.
- keep_alive <string | number>: (Optional) How long to keep the model loaded.
- options <Options>: (Optional) Options to configure the runtime.
Returns: <ChatResponse>
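
Since messages carries the whole chat history, a multi-turn conversation can be kept by appending each reply before the next call. This is a sketch under assumptions: sendTurn is a hypothetical helper name, and client stands for any object exposing chat(request), such as the default ollama export.

```javascript
// Hypothetical helper: sends one user turn and appends both the user message
// and the assistant reply to `history`, so later turns keep their context.
async function sendTurn(client, model, history, userText) {
  history.push({ role: 'user', content: userText })
  const response = await client.chat({ model, messages: history })
  // response.message is the assistant message ({ role, content }).
  history.push(response.message)
  return response.message.content
}

// Usage with the library (needs a running Ollama server):
//   const history = [{ role: 'system', content: 'Answer briefly.' }]
//   console.log(await sendTurn(ollama, 'llama2', history, 'Why is the sky blue?'))
//   console.log(await sendTurn(ollama, 'llama2', history, 'And at sunset?'))
```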

generate

ollama.generate(request)

request <Object>: The request object containing generate parameters.
- model <string>: The name of the model to use for generation.
- prompt <string>: The prompt to send to the model.
- system <string>: (Optional) Override the model system prompt.
- template <string>: (Optional) Override the model template.
- raw <boolean>: (Optional) Bypass the prompt template and pass the prompt directly to the model.
- images <Uint8Array[] | string[]>: (Optional) Images to be included, either as Uint8Array or base64 encoded strings.
- format <string>: (Optional) Set the expected format of the response (json).
- stream <boolean>: (Optional) When true an AsyncGenerator is returned.
- keep_alive <string | number>: (Optional) How long to keep the model loaded.
- options <Options>: (Optional) Options to configure the runtime.
Returns: <GenerateResponse>
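
The format and options parameters can be combined to request machine-readable output. The sketch below assumes that the generated text is exposed on the response's response field, as in the REST API, and that the model actually emits valid JSON; generateJson is a hypothetical helper name.

```javascript
// Hypothetical helper: asks the model for JSON output and parses it.
// `client` is any object exposing generate(request), e.g. the ollama export.
async function generateJson(client, model, prompt) {
  const result = await client.generate({
    model,
    prompt,
    format: 'json', // ask for a JSON-formatted response
    options: { temperature: 0 }, // lower randomness for more stable output
  })
  // Assumption: the generated text is in `result.response`.
  // JSON.parse throws if the model returned something that is not valid JSON.
  return JSON.parse(result.response)
}

// Usage with the library (needs a running Ollama server):
//   const data = await generateJson(ollama, 'llama2', 'List three colors as JSON.')
```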

pull

ollama.pull(request)

request <Object>: The request object containing pull parameters.
- model <string>: The name of the model to pull.
- insecure <boolean>: (Optional) Pull from servers whose identity cannot be verified.
- stream <boolean>: (Optional) When true an AsyncGenerator is returned.
Returns: <ProgressResponse>
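
With stream: true, pull progress can be reported as it arrives. The sketch below assumes each progress part has a status string and, while downloading, numeric total and completed fields, mirroring the REST API; formatProgress is a hypothetical helper name.

```javascript
// Hypothetical helper: turns one progress part into a printable line.
// Assumption: parts look like { status, total?, completed? }.
function formatProgress(part) {
  if (part.total && part.completed !== undefined) {
    const percent = Math.floor((part.completed / part.total) * 100)
    return `${part.status}: ${percent}%`
  }
  // Parts without byte counts (e.g. a final status) are printed as-is.
  return part.status
}

// Usage with the library (needs a running Ollama server):
//   const stream = await ollama.pull({ model: 'llama2', stream: true })
//   for await (const part of stream) {
//     console.log(formatProgress(part))
//   }
```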

push

ollama.push(request)

request <Object>: The request object containing push parameters.
- model <string>: The name of the model to push.
- insecure <boolean>: (Optional) Push to servers whose identity cannot be verified.
- stream <boolean>: (Optional) When true an AsyncGenerator is returned.
Returns: <ProgressResponse>

create

ollama.create(request)

request <Object>: The request object containing create parameters.
- model <string>: The name of the model to create.
- path <string>: (Optional) The path to the Modelfile of the model to create.
- modelfile <string>: (Optional) The content of the Modelfile to create.
- stream <boolean>: (Optional) When true an AsyncGenerator is returned.
Returns: <ProgressResponse>

delete

ollama.delete(request)

request <Object>: The request object containing delete parameters.
- model <string>: The name of the model to delete.
Returns: <StatusResponse>

copy

ollama.copy(request)

request <Object>: The request object containing copy parameters.
- source <string>: The name of the model to copy from.
- destination <string>: The name of the model to copy to.
Returns: <StatusResponse>
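
The API has no rename call, but copy and delete can be combined to get the same effect. This is only a sketch: renameModel is a hypothetical helper, and it deletes the source only after the copy has resolved without throwing.

```javascript
// Hypothetical helper: "renames" a model by copying it and then deleting
// the original. `client` is any object exposing copy() and delete().
async function renameModel(client, source, destination) {
  await client.copy({ source, destination })
  // Reached only if the copy did not throw.
  await client.delete({ model: source })
}

// Usage with the library (needs a running Ollama server):
//   await renameModel(ollama, 'example', 'mario')
```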

list

ollama.list()

Returns: <ListResponse>
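
A common use of list is checking whether a model is already available locally before pulling it. The sketch below assumes the response has a models array whose entries have a name that includes the tag (for example 'llama2:latest'), and that an untagged name means the latest tag; hasModel is a hypothetical helper name.

```javascript
// Hypothetical helper: reports whether a model is present locally.
// Assumption: list() resolves to { models: [{ name: 'llama2:latest', ... }] }.
async function hasModel(client, name) {
  const { models } = await client.list()
  // Assumption: a name without a tag refers to the "latest" tag.
  const wanted = name.includes(':') ? name : `${name}:latest`
  return models.some((m) => m.name === wanted)
}

// Usage with the library (needs a running Ollama server):
//   if (!(await hasModel(ollama, 'llama2'))) {
//     await ollama.pull({ model: 'llama2' })
//   }
```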

show

ollama.show(request)

request <Object>: The request object containing show parameters.
- model <string>: The name of the model to show.
- system <string>: (Optional) Override the model system prompt returned.
- template <string>: (Optional) Override the model template returned.
- options <Options>: (Optional) Options to configure the runtime.
Returns: <ShowResponse>

embeddings

ollama.embeddings(request)

request <Object>: The request object containing embedding parameters.
- model <string>: The name of the model used to generate the embeddings.
- prompt <string>: The prompt used to generate the embedding.
- keep_alive <string | number>: (Optional) How long to keep the model loaded.
- options <Options>: (Optional) Options to configure the runtime.
Returns: <EmbeddingsResponse>
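
Embeddings are typically compared with cosine similarity. The sketch below assumes the response exposes the vector as an embedding array of numbers, as in the REST API; cosineSimilarity and similarity are hypothetical helper names.

```javascript
// Cosine similarity of two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Hypothetical helper: embeds two texts and compares them.
// Assumption: embeddings() resolves to { embedding: number[] }.
async function similarity(client, model, textA, textB) {
  const [a, b] = await Promise.all([
    client.embeddings({ model, prompt: textA }),
    client.embeddings({ model, prompt: textB }),
  ])
  return cosineSimilarity(a.embedding, b.embedding)
}

// Usage with the library (needs a running Ollama server):
//   const score = await similarity(ollama, 'llama2', 'The sky is blue', 'The ocean is blue')
```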

Custom client

A custom client can be created with the following fields:

host <string>: (Optional) The Ollama host address. Default: "http://127.0.0.1:11434".
fetch <Object>: (Optional) The fetch library used to make requests to the Ollama host.

import { Ollama } from 'ollama'

const ollama = new Ollama({ host: 'http://localhost:11434' })
const response = await ollama.chat({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
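
The fetch field can also be used to adjust every request, for example to add headers when the host sits behind a proxy that requires authentication. This is a sketch under assumptions: withHeaders is a hypothetical helper, the host URL and token are placeholders, and the global Headers class (available in browsers and in Node.js 18 or later) is assumed.

```javascript
// Hypothetical helper: wraps a fetch implementation so that every request
// carries the given extra headers in addition to its own.
function withHeaders(baseFetch, extraHeaders) {
  return (input, init = {}) => {
    const headers = new Headers(init.headers)
    for (const [key, value] of Object.entries(extraHeaders)) {
      headers.set(key, value)
    }
    return baseFetch(input, { ...init, headers })
  }
}

// Usage with the library (placeholder host and token):
//   import { Ollama } from 'ollama'
//   const client = new Ollama({
//     host: 'https://ollama.example.com',
//     fetch: withHeaders(fetch, { Authorization: 'Bearer <token>' }),
//   })
```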

Building

To build the project files run:

npm run build
