
Midnights-LLM

This repository contains instructions and tools to locally deploy LLMs on the Midnights server.

Setup

The current deployment is primarily based on vLLM for high-throughput, memory-efficient LLM inference. The main requirements are vLLM, PyTorch, transformers, and openai (for deploying LLMs as a server that mimics the OpenAI API protocol).
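As a preview of what this enables, here is a minimal offline-inference sketch using vLLM's Python API. The model name and prompt are placeholder assumptions, not part of this repository; swap in whichever checkpoint you deploy on Midnights.

from vllm import LLM, SamplingParams

# Load a model onto the available GPU(s); the checkpoint here is a placeholder.
llm = LLM(model="facebook/opt-125m")

# Sampling settings: temperature and a cap on generated tokens.
sampling_params = SamplingParams(temperature=0.8, max_tokens=64)

# Generate completions for a batch of prompts and print the first candidate.
for output in llm.generate(["Midnights is"], sampling_params):
    print(output.outputs[0].text)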

Since Midnights has an older CUDA version, I recommend using the requirements.txt file to install all dependencies (with CUDA compatibility). Specifically,

echo "export CONDA_HOME=<path_to_conda_installation>" >> ~/.bashrc
source ~/.bashrc
bash setup.sh
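After setup.sh finishes, a quick sanity check (a sketch, not part of setup.sh) can confirm that the installed PyTorch build sees the GPUs and reports a CUDA version compatible with Midnights' driver:

import torch

# Report the installed PyTorch build and the CUDA version it was compiled
# against, then verify that at least one GPU is visible.
print("PyTorch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))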

To Do

  • Add commands for inference and deployment (an interim sketch of querying a running server follows below)
  • Add example scripts for inference and deployment
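Until those commands and scripts are added, here is a minimal sketch of querying an already-running vLLM OpenAI-compatible server through the openai Python client. The host, port, API key, and model name are all placeholder assumptions.

from openai import OpenAI

# A vLLM OpenAI-compatible server is assumed to be listening locally on
# port 8000; if no API key was configured, any string works.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.completions.create(
    model="facebook/opt-125m",  # placeholder: the model the server was started with
    prompt="Midnights is",
    max_tokens=64,
)
print(response.choices[0].text)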