Deploy Replit's Code Completion LM (replit-code-v1-3b) with Baseten

Prerequisites

  • Set up a Baseten account
  • Clone this repo
  • Install the Baseten and Truss clients
pip install baseten
pip install truss
  • Authenticate with Baseten
import baseten
baseten.login("*** BASETEN API KEY ***")
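
To avoid pasting the key into source files, the login step above could read it from the environment instead. The following is a minimal sketch, not part of this repo: the helper names `load_api_key` and `login_from_env` and the environment variable name `BASETEN_API_KEY` are assumptions chosen for illustration. Only `baseten.login` comes from the instructions above.

```python
import os


def load_api_key(env_var="BASETEN_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name fits your setup.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before logging in to Baseten")
    return key


def login_from_env(env_var="BASETEN_API_KEY"):
    """Log in with the key from the environment (hypothetical helper)."""
    # Imported lazily so load_api_key() works even without the client installed.
    import baseten

    baseten.login(load_api_key(env_var))
```

With the variable exported in your shell, calling `login_from_env()` stands in for the hard-coded `baseten.login(...)` line.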

Deploy to Baseten

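import baseten
import truss

# Load the Truss from the current directory (the root of the cloned repo)
replitlm_handle = truss.load(".")
# Deploy the loaded Truss to your Baseten account under this model name
baseten.deploy(replitlm_handle, model_name="Replit Code Completion V1 (3B)")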

Hardware

This model deploys to an NVIDIA A10G GPU. Inference takes 3-5 seconds, depending on the length of the prompt.
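
To check how latency varies with prompt length on your own deployment, you could wrap whatever function you use to call the model in a small timer. This is a sketch only: `timed_call` is a hypothetical helper, and `fake_predict` is a stand-in for a real model call, which this README does not specify.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_predict(prompt):
    """Stand-in for a real model call; replace with your own invocation."""
    return prompt + " ..."


if __name__ == "__main__":
    # Compare a short and a longer prompt.
    for prompt in ["def add(a, b):", "def fibonacci(n):\n    # return the nth number\n"]:
        _, seconds = timed_call(fake_predict, prompt)
        print(f"{len(prompt)} chars: {seconds:.3f} s")
```

Swapping `fake_predict` for your actual call gives a per-prompt measurement to compare against the 3-5 second range quoted above.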