LeapfrogAI CTransformers Backend

Description

A LeapfrogAI API-compatible CTransformers wrapper for quantized model inferencing.

Usage

Follow the Instructions section to get the backend up and running, then use the LeapfrogAI API server to interact with it.

Instructions

The instructions in this section assume the following:

Python 3.11.x is installed and configured, including its development tools
wget is installed
The LeapfrogAI API server is deployed and running
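
The first two assumptions above can be checked from a shell before running any make targets. The following is a minimal sketch, not part of the repository: the helper name check_python_version is hypothetical, and it assumes the interpreter is available as python3 on your PATH. It does not check whether the LeapfrogAI API server is running.

```shell
#!/usr/bin/env sh
# Sketch: report missing prerequisites (hypothetical helper, not shipped with the repo).

# check_python_version VERSION: succeeds only for 3.11.x version strings.
check_python_version() {
    case "$1" in
        3.11.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report each missing tool without aborting, so every problem is listed at once.
for tool in python3 wget; do
    command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done

# Compare the interpreter version against the expected 3.11.x series.
if command -v python3 >/dev/null 2>&1; then
    version="$(python3 -c 'import platform; print(platform.python_version())')"
    check_python_version "$version" || echo "python3 is $version, expected 3.11.x"
fi
```

No output means python3 and wget were both found and the Python version is in the 3.11 series.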

Local Development

Clone a model locally and run the development backend:

# Clone Model
make fetch-model

# Setup Python Virtual Environment
make create-venv
make activate-venv
make requirements-dev

# Start Model Backend
make dev

Docker Container

Image Build and Run

To build and run the image locally:

# Build the docker image
docker build -t ghcr.io/defenseunicorns/leapfrogai/ctransformers:latest-cpu .

# Run the docker container
docker run -p 50051:50051 -v ./config.yaml:/leapfrogai/config.yaml ghcr.io/defenseunicorns/leapfrogai/ctransformers:latest-cpu
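
The run command above bind-mounts config.yaml from the current directory using a relative host path. Some Docker versions may not accept relative paths in -v, and when the host file is missing, -v creates a directory in its place. A minimal sketch that guards against both, assuming config.yaml sits in the directory you run it from; the function names config_mount and run_backend are hypothetical and not part of the repository:

```shell
# config_mount DIR: print the -v argument that mounts DIR/config.yaml
# at /leapfrogai/config.yaml (the container path used in the command above).
config_mount() {
    printf '%s/config.yaml:/leapfrogai/config.yaml\n' "$1"
}

# run_backend IMAGE: start the container using an absolute bind-mount path.
run_backend() {
    # Stop early if the file is missing, so Docker does not create a directory instead.
    if [ ! -f "$(pwd)/config.yaml" ]; then
        echo "config.yaml not found in $(pwd)" >&2
        return 1
    fi
    docker run -p 50051:50051 -v "$(config_mount "$(pwd)")" "$1"
}
```

For example, run_backend ghcr.io/defenseunicorns/leapfrogai/ctransformers:latest-cpu starts the image built in the previous step.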

To pull and run a tagged image from the main release repository, replace <IMAGE_TAG> with the tag of one of the released packages.

# Download and run remote image
docker run -p 50051:50051 -v ./config.yaml:/leapfrogai/config.yaml ghcr.io/defenseunicorns/leapfrogai/ctransformers:<IMAGE_TAG>
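
To avoid running the command with the <IMAGE_TAG> placeholder left in, the tag can be passed through a small helper that refuses an empty value. This is a sketch only; the function name image_ref is hypothetical, and any tag you pass must be one that was actually released.

```shell
# image_ref TAG: print the full image reference for a release tag.
image_ref() {
    # Refuse an empty tag rather than producing a malformed reference.
    if [ -z "$1" ]; then
        echo "usage: image_ref <IMAGE_TAG>" >&2
        return 1
    fi
    printf 'ghcr.io/defenseunicorns/leapfrogai/ctransformers:%s\n' "$1"
}
```

The result can then be substituted into the run command, for example: docker run -p 50051:50051 -v ./config.yaml:/leapfrogai/config.yaml "$(image_ref "$IMAGE_TAG")", with IMAGE_TAG set in your shell.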

GPU Inferencing