🧑‍🔬 Tabby Registry

Completion models (--model)

Recommended hardware by model size:

  • For 1B to 3B models, it's advisable to have at least NVIDIA T4, 10 Series, or 20 Series GPUs, or Apple Silicon like the M1.
  • For 7B to 13B models, we recommend using NVIDIA V100, A100, 30 Series, or 40 Series GPUs.
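The hardware guidance above can be sketched as a small lookup. This is only an illustration, not part of Tabby: the function names are hypothetical, the parameter count is parsed from the model ID on the assumption that it appears as a `-<number>B` segment, and sizes that fall between the two documented ranges (for example 6.7B) are assumed to round up to the larger tier. Models larger than 13B, or IDs without a size segment, get no recommendation because the text above does not cover them.

```python
import re
from typing import Optional

# Hardware tiers copied from the recommendations above.
SMALL_TIER = "NVIDIA T4, 10 Series, or 20 Series GPU, or Apple Silicon such as the M1"
LARGE_TIER = "NVIDIA V100, A100, 30 Series, or 40 Series GPU"


def parse_size_in_billions(model_id: str) -> Optional[float]:
    """Extract the parameter count from IDs such as 'DeepseekCoder-1.3B'.

    Returns None when the ID has no '-<number>B' segment.
    """
    match = re.search(r"-(\d+(?:\.\d+)?)B(?:-|$)", model_id)
    return float(match.group(1)) if match else None


def recommended_hardware(model_id: str) -> Optional[str]:
    """Map a model ID to a hardware tier (hypothetical helper).

    Assumption: sizes between the documented ranges round up to the
    larger tier; sizes above 13B return None (no guidance in the text).
    """
    size = parse_size_in_billions(model_id)
    if size is None:
        return None
    if size <= 3:
        return SMALL_TIER
    if size <= 13:
        return LARGE_TIER
    return None
```

For instance, `recommended_hardware("StarCoder-1B")` returns the smaller tier, while `recommended_hardware("Codestral-22B")` returns `None` because 22B is outside the documented ranges.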

Benchmarks for these models are published at https://leaderboard.tabbyml.com to help Tabby users weigh the trade-offs between quality, licensing, and model size.

Model ID                  License
StarCoder-1B              BigCode-OpenRAIL-M
StarCoder-3B              BigCode-OpenRAIL-M
StarCoder-7B              BigCode-OpenRAIL-M
StarCoder2-3B             BigCode-OpenRAIL-M
StarCoder2-7B             BigCode-OpenRAIL-M
CodeLlama-7B              Llama 2
CodeLlama-13B             Llama 2
DeepseekCoder-1.3B        Deepseek License
DeepseekCoder-6.7B        Deepseek License
CodeGemma-2B              Gemma License
CodeGemma-7B              Gemma License
CodeQwen-7B               Tongyi Qianwen License
Codestral-22B             Mistral AI Non-Production License
DeepSeek-Coder-V2-Lite    Deepseek License

Chat models (--chat-model)

Because latency requirements are not stringent for chat, we recommend a model with at least 1B parameters to ensure good response quality.

Model ID                  License
Mistral-7B                Apache 2.0
CodeGemma-7B-Instruct     Gemma License
Qwen2-1.5B-Instruct       Apache 2.0
CodeQwen-7B-Chat          Tongyi Qianwen License
Codestral-22B             Mistral AI Non-Production License
Yi-Coder-9B-Chat          Apache 2.0
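As a rough illustration of how the two flags fit together, the sketch below assembles a launch command from a completion model ID and a chat model ID taken from the tables above. Only `--model` and `--chat-model` come from this page; the `serve` subcommand and the `--device` flag are assumptions about the Tabby CLI, and `build_tabby_command` is a hypothetical helper, so check the CLI help for your installed version before relying on it.

```python
import shlex
from typing import List, Optional


def build_tabby_command(
    model: Optional[str] = None,
    chat_model: Optional[str] = None,
    device: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for launching Tabby (hypothetical helper).

    Assumption: the CLI exposes a 'serve' subcommand and a '--device' flag;
    only '--model' and '--chat-model' are named on this page.
    """
    command = ["tabby", "serve"]
    if model:
        command += ["--model", model]
    if chat_model:
        command += ["--chat-model", chat_model]
    if device:
        command += ["--device", device]
    return command


# Render the argument list as a single shell-safe string for display.
command = build_tabby_command("StarCoder-1B", "Qwen2-1.5B-Instruct", "cuda")
print(shlex.join(command))
```

Returning a list rather than a single string keeps the arguments safe to pass directly to `subprocess.run` without shell quoting concerns.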

Embedding models

Model ID                  License
Nomic-Embed-Text          Apache 2.0
Jina-Embeddings-V2-Code   Apache 2.0