
🏗️ Fine-tune, build, and deploy open-source LLMs easily!


AIKit ✨


AIKit is a comprehensive platform for quickly getting started with hosting, deploying, building, and fine-tuning large language models (LLMs).

AIKit offers two main capabilities:

  • Inference: AIKit uses LocalAI, which supports a wide range of inference capabilities and formats. LocalAI provides a drop-in replacement REST API that is OpenAI API compatible, so you can use any OpenAI API compatible client, such as Kubectl AI, Chatbot-UI, and many more, to send requests to open LLMs!

  • Fine-Tuning: AIKit offers an extensible fine-tuning interface. It supports Unsloth for a fast, memory-efficient, and easy fine-tuning experience.

👉 For full documentation, please see AIKit website!

Quick Start

You can get started with AIKit quickly on your local machine without a GPU!

docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b

After running this, navigate to http://localhost:8080/chat to access the WebUI!

API

AIKit provides an OpenAI API compatible endpoint, so you can use any OpenAI API compatible client to send requests to open LLMs!

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
  }'
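The same request can be built from Python with only the standard library. This is a minimal sketch: the endpoint, model name, and message are the ones from the curl example above, and the helper name is our own.

```python
import json
import urllib.request

def build_chat_request(model, content, base_url="http://localhost:8080"):
    """Build an OpenAI-style chat completion request for the local AIKit endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# With the Quick Start container running, send the request and print the reply:
# with urllib.request.urlopen(build_chat_request(
#         "llama-3.1-8b-instruct", "explain kubernetes in a sentence")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```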

Output should be similar to:

{
  // ...
    "model": "llama-3.1-8b-instruct",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of applications and services, allowing developers to focus on writing code rather than managing infrastructure."
            }
        }
    ],
  // ...
}

That's it! 🎉 The API is OpenAI compatible, so it works as a drop-in replacement for any OpenAI API compatible client.
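If you are handling the raw JSON yourself, extracting the assistant's reply follows the standard OpenAI response shape shown above. A small sketch, using a trimmed copy of the example response:

```python
import json

def extract_reply(response_json):
    """Pull the assistant message text out of an OpenAI-style chat completion response."""
    response = json.loads(response_json)
    return response["choices"][0]["message"]["content"]

# Trimmed example response, matching the shape shown above:
sample = """{
  "model": "llama-3.1-8b-instruct",
  "choices": [
    {"index": 0, "finish_reason": "stop",
     "message": {"role": "assistant",
                 "content": "Kubernetes is an open-source container orchestration system."}}
  ]
}"""
print(extract_reply(sample))
```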

Pre-made Models

AIKit comes with pre-made models that you can use out-of-the-box!

If the pre-made models don't include the one you need, you can always create your own images and host them in a container registry of your choice!
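As a rough illustration, a custom image is described with a declarative config file and built with Docker BuildKit. The schema below is an illustrative sketch only (the model name and source URL are placeholders); see the AIKit website for the authoritative reference.

```yaml
#syntax=ghcr.io/sozercan/aikit:latest
apiVersion: v1alpha1
models:
    # name clients will use in the "model" field of API requests
  - name: my-model
    # placeholder URL; point this at your model file
    source: https://example.com/path/to/model.gguf
```

A file like this would then be built into an image with a standard BuildKit invocation such as docker buildx build . -t my-registry/my-model -f aikitfile.yaml --load, and pushed to your registry.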

CPU

| Model | Optimization | Parameters | Command | Model Name | License |
|-------|--------------|------------|---------|------------|---------|
| 🦙 Llama 3.2 | Instruct | 1B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:1b | llama-3.2-1b-instruct | Llama |
| 🦙 Llama 3.2 | Instruct | 3B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:3b | llama-3.2-3b-instruct | Llama |
| 🦙 Llama 3.1 | Instruct | 8B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b | llama-3.1-8b-instruct | Llama |
| 🦙 Llama 3.1 | Instruct | 70B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:70b | llama-3.1-70b-instruct | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b | mixtral-8x7b-instruct | Apache |
| 🅿️ Phi 3.5 | Instruct | 3.8B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b | phi-3.5-3.8b-instruct | MIT |
| 🔡 Gemma 2 | Instruct | 2B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma2:2b | gemma-2-2b-instruct | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b | codestral-22b | MNPL |

NVIDIA CUDA

Note

To enable GPU acceleration, please see GPU Acceleration. Please note that the only difference between the CPU and GPU sections is the --gpus all flag in the command, which enables GPU acceleration.

| Model | Optimization | Parameters | Command | Model Name | License |
|-------|--------------|------------|---------|------------|---------|
| 🦙 Llama 3.2 | Instruct | 1B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:1b | llama-3.2-1b-instruct | Llama |
| 🦙 Llama 3.2 | Instruct | 3B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:3b | llama-3.2-3b-instruct | Llama |
| 🦙 Llama 3.1 | Instruct | 8B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:8b | llama-3.1-8b-instruct | Llama |
| 🦙 Llama 3.1 | Instruct | 70B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:70b | llama-3.1-70b-instruct | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b | mixtral-8x7b-instruct | Apache |
| 🅿️ Phi 3.5 | Instruct | 3.8B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b | phi-3.5-3.8b-instruct | MIT |
| 🔡 Gemma 2 | Instruct | 2B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma2:2b | gemma-2-2b-instruct | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b | codestral-22b | MNPL |
| 📸 Flux 1 Dev | Text to image | 12B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/flux1:dev | flux-1-dev | FLUX.1 [dev] Non-Commercial License |

What's next?

👉 For more information on how to fine-tune models or create your own images, please see the AIKit website!