Weave AI is a collection of Flux controllers and a CLI that manage the lifecycle of Large Language Models (LLMs) on Kubernetes.
The Weave AI CLI aims to be the easiest way to onboard LLMs onto Kubernetes, while the Weave AI controllers manage the lifecycle of those models, including training, serving, and monitoring on production Kubernetes clusters.
Here's a step-by-step guide to running your first LLM with Weave AI.
Please install Kubernetes v1.27+ and Flux v2.1.0+ before proceeding. The minimum requirements for the Kubernetes cluster are 8 CPUs, 16 GB of memory, and 100 GB of SSD storage.
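As a quick sanity check, you can confirm the client-side tool versions on your workstation before continuing (the cluster itself is created in a later step):
flux --version
kubectl version --client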
First, install the Weave AI Command Line Interface (CLI). You can use the installation script:
curl -s https://raw.githubusercontent.com/weave-ai/weave-ai/main/install/weave-ai.sh | sudo bash
You should see information about the download and installation process, confirming successful completion.
[INFO] Downloading metadata https://api.github.com/repos/weave-ai/weave-ai/releases/latest
[INFO] Using 0.11.0 as release
[INFO] Downloading hash https://github.com/weave-ai/weave-ai/releases/download/v0.11.0/weave-ai_0.11.0_checksums.txt
[INFO] Downloading binary https://github.com/weave-ai/weave-ai/releases/download/v0.11.0/weave-ai_0.11.0_linux_amd64.tar.gz
[INFO] Verifying binary download
[INFO] Installing weave-ai to /usr/local/bin/weave-ai
Alternatively, install the CLI with Homebrew:
brew install weave-ai/tap/weave-ai
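Either way, you can confirm the binary is on your PATH (this only verifies the install location; see the Weave AI documentation for CLI usage):
command -v weave-ai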
Next, set up a Kubernetes in Docker (KIND) cluster with this command:
kind create cluster
The output will confirm the creation of your cluster, detailing steps such as node image preparation, node configuration, and control-plane initialization.
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.27.3) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Have a question, bug, or feature request? Let us know! https://kind.sigs.k8s.io/#community 🙂
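If you want to pin the cluster to the Kubernetes version shown above (for example, to match the v1.27+ prerequisite explicitly), kind accepts a specific node image. This is optional, since the default shown here already satisfies the requirement:
kind create cluster --image kindest/node:v1.27.3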
Now, install Flux to manage your cluster resources. Note: You need to disable the default network policy:
flux install --network-policy=false
The output will show the successful installation of various components in the flux-system namespace.
✚ generating manifests
✔ manifests build completed
► installing components in flux-system namespace
CustomResourceDefinition/alerts.notification.toolkit.fluxcd.io created
CustomResourceDefinition/buckets.source.toolkit.fluxcd.io created
CustomResourceDefinition/gitrepositories.source.toolkit.fluxcd.io created
CustomResourceDefinition/helmcharts.source.toolkit.fluxcd.io created
CustomResourceDefinition/helmreleases.helm.toolkit.fluxcd.io created
CustomResourceDefinition/helmrepositories.source.toolkit.fluxcd.io created
CustomResourceDefinition/kustomizations.kustomize.toolkit.fluxcd.io created
CustomResourceDefinition/ocirepositories.source.toolkit.fluxcd.io created
CustomResourceDefinition/providers.notification.toolkit.fluxcd.io created
CustomResourceDefinition/receivers.notification.toolkit.fluxcd.io created
Namespace/flux-system created
ResourceQuota/flux-system/critical-pods-flux-system created
ServiceAccount/flux-system/helm-controller created
ServiceAccount/flux-system/kustomize-controller created
ServiceAccount/flux-system/notification-controller created
ServiceAccount/flux-system/source-controller created
ClusterRole/crd-controller-flux-system created
ClusterRole/flux-edit-flux-system created
ClusterRole/flux-view-flux-system created
ClusterRoleBinding/cluster-reconciler-flux-system created
ClusterRoleBinding/crd-controller-flux-system created
Service/flux-system/notification-controller created
Service/flux-system/source-controller created
Service/flux-system/webhook-receiver created
Deployment/flux-system/helm-controller created
Deployment/flux-system/kustomize-controller created
Deployment/flux-system/notification-controller created
Deployment/flux-system/source-controller created
◎ verifying installation
✔ helm-controller: deployment ready
✔ kustomize-controller: deployment ready
✔ notification-controller: deployment ready
✔ source-controller: deployment ready
✔ install finished
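You can re-verify that the Flux controllers are healthy at any time with the built-in health check:
flux check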
At this stage, you're ready to install Weave AI and its controller(s):
weave-ai install
After installation, you'll see confirmation messages, indicating that the Weave AI controllers and the default model catalog are set up.
✚ generating manifests
✔ manifests build completed
► installing components in weave-ai namespace
CustomResourceDefinition/languagemodels.ai.contrib.fluxcd.io created
Namespace/weave-ai created
Role/default/lm-tenant-role created
Role/weave-ai/lm-leader-election-role created
ClusterRole/lm-manager-role created
RoleBinding/default/lm-tenant-role-binding created
RoleBinding/weave-ai/lm-leader-election-rolebinding created
ClusterRoleBinding/lm-cluster-reconciler created
ClusterRoleBinding/lm-manager-rolebinding created
Deployment/weave-ai/lm-controller created
◎ verifying installation
✔ lm-controller: deployment ready
✔ install finished
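To double-check that the controller is running, query the weave-ai namespace directly with kubectl:
kubectl get deployments,pods -n weave-ai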
To view the available models in your cluster, use:
weave-ai models
This command lists all OCI models, which are initially in an INACTIVE state to conserve resources.
NAME                               VERSION           FAMILY  STATUS    CREATED
weave-ai/dragon-yi-6b              v0.0.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/llama-2-7b-chat           v1.0.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/llama-2-7b-instruct-32k   v1.0.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/llamaguard-7b             v0.1.0-q4km-gguf          INACTIVE  1 minute ago
weave-ai/mistral-7b-instruct-v0.1  v0.1.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/mistral-7b-v0.1           v0.1.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/mistrallite-7b            v1.0.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/orca-2-7b                 v1.0.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/stablelm-zephyr-3b        v0.1.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/tinyllama-1.1b-chat       v0.3.0-q3ks-gguf          INACTIVE  1 minute ago
weave-ai/yarn-mistral-7b-128k      v0.1.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/zephyr-7b-alpha           v1.0.0-q5km-gguf          INACTIVE  1 minute ago
weave-ai/zephyr-7b-beta            v1.0.0-q5km-gguf          INACTIVE  1 minute ago
To activate and run a model, use the following command:
weave-ai run -d --ui --name my-model weave-ai/zephyr-7b-beta
This command activates the model and sets up a chat UI for interaction. Follow the port-forwarding instructions in the output to access the LLM and the UI.
► checking if model weave-ai/zephyr-7b-beta exists and is active
► activate model weave-ai/zephyr-7b-beta
◎ waiting for model weave-ai/zephyr-7b-beta to be active
► creating new LLM instance default/my-model
◎ waiting for default/my-model to be ready
◎ waiting for default/my-model-chat-app to be ready
✔ to connect to your LLM:
kubectl port-forward -n default svc/my-model 8000:8000
✔ to connect to the UI:
kubectl port-forward -n default deploy/my-model-chat-app 8501:8501
Simply run the UI port-forward command:
kubectl port-forward -n default deploy/my-model-chat-app 8501:8501
Then open your browser at http://localhost:8501 to try the model via our quick chat app.
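If you would rather call the model endpoint directly instead of using the chat UI, run the first port-forward command (svc/my-model on port 8000) and send it a request. The exact API surface depends on how Weave AI serves the model; assuming an OpenAI-compatible chat completions endpoint (an assumption, not confirmed by the output above), a request might look like this:
kubectl port-forward -n default svc/my-model 8000:8000
# In a second terminal; /v1/chat/completions assumes an OpenAI-compatible server, adjust the path if your deployment differs:
curl -s http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "Say hello in one sentence."}]}'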
Finally, to remove the LM instance and the associated UI, execute:
kubectl delete lm/my-model
This command deletes the specified language model instance from your cluster, along with the default chat UI if you created one.
languagemodel.ai.contrib.fluxcd.io "my-model" deleted
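If you created the KIND cluster only for this walkthrough, you can remove the whole test environment as well:
kind delete cluster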