Kubernetes LLM Interface

Primary Language: Python

llama-k8s

This repository demonstrates running a 70B-parameter LLM in a Knative service on CoreWeave. It is intended mainly for educational purposes and does not represent "best practices" for deploying LLMs on Knative. Everything here is provided as-is, with no guarantee of functionality.