pip install baseten
pip install truss
- Authenticate with Baseten
import baseten
baseten.login("*** BASETEN API KEY ***")
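To avoid pasting the key directly into a script, it can be read from the environment first. The following is a minimal sketch, not part of the Baseten client: the helper `read_api_key` and the variable name `BASETEN_API_KEY` are assumptions made for illustration, and the `baseten.login` call is the same one shown above.

```python
import os


def read_api_key(var_name="BASETEN_API_KEY", env=None):
    """Return the API key stored in an environment variable.

    `var_name` is an assumed name chosen for this sketch; use whatever
    variable your environment actually defines.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before logging in to Baseten")
    return key


# Usage (requires the baseten package installed above):
# import baseten
# baseten.login(read_api_key())
```

Keeping the key out of the source file means the script can be committed or shared without exposing credentials.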
import baseten
import truss
replitlm_handle = truss.load(".")
baseten.deploy(replitlm_handle, model_name="Replit Code Completion V1 (3B)")
This model will deploy to an NVIDIA A10G GPU. Inference takes 3-5 seconds, depending on the length of the prompt.