This repo contains the server code for running Galactica models for text generation. It is built on the vLLM project, and I plan to add support for the Hugging Face Text Generation Inference codebase soon.
It is compatible with all Galactica models from here and with all models supported by vLLM.
Use the GalacTex Chrome Extension to connect the server to Overleaf.
Download the model from here using:

```bash
huggingface-cli download --repo-type model \
  --local-dir "<LOCAL_MODEL_REPO>/facebook--galactica-6.7b" \
  --local-dir-use-symlinks False \
  "facebook/galactica-6.7b"
```
Build the vLLM Docker image and start the server:

```bash
cd vllm
docker build -t vllm:0.2.2 -f Dockerfile .
./start_server_docker.sh <MODEL_PATH> <PORT>
```
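For orientation, here is a minimal sketch of what a wrapper like `start_server_docker.sh` might run under the hood. The `/model` mount point, the entrypoint override, and the internal port are assumptions, not the script's actual contents:

```bash
# Hypothetical expansion of: ./start_server_docker.sh <MODEL_PATH> <PORT>
# Mount the model read-only and serve it with vLLM's bundled API server
# (mount point and flags are assumptions about the script).
docker run --gpus all --rm \
  --entrypoint python3 \
  -v "<MODEL_PATH>":/model:ro \
  -p "<PORT>":8000 \
  vllm:0.2.2 \
  -m vllm.entrypoints.api_server --model /model --host 0.0.0.0 --port 8000
```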
Alternatively, follow the instructions in the vLLM docs to install vLLM, then run the API server with:

```bash
./run_api_server.sh <MODEL_PATH>
```
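If you prefer to launch vLLM by hand, the script presumably wraps something like vLLM's bundled API server entrypoint; a sketch, using standard vLLM flags (the port value is an assumption):

```bash
# Minimal sketch of launching vLLM's API server directly;
# run_api_server.sh likely wraps something similar (exact flags assumed).
python -m vllm.entrypoints.api_server \
  --model "<MODEL_PATH>" \
  --host 0.0.0.0 \
  --port "<PORT>"
```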
Then go to the extension settings and set the server endpoint to `http://localhost:<PORT>/api/generate`.
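To verify the server is reachable before using the extension, you can smoke-test the same endpoint from the command line. The JSON fields below follow vLLM's API server request format (`prompt` plus sampling parameters such as `max_tokens`); whether this server uses exactly the same schema is an assumption:

```bash
# Smoke test of the endpoint the extension will call
# (request schema assumed to match vLLM's api_server).
curl -X POST "http://localhost:<PORT>/api/generate" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "The Transformer architecture", "max_tokens": 32}'
```

If the server responds with generated text, the extension should work once the endpoint is set as above.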
This is a quick-and-dirty project to make writing papers with code-completion-style suggestions easier, born out of procrastination and the need to write a paper quickly during my PhD.
Wissam Antoun: LinkedIn | Twitter | GitHub | wissam.antoun (AT) gmail (DOT) com | wissam.antoun (AT) inria (DOT) fr