/text2text-inference-service

Text2Text Inference Service for AI App Store (with pretrained models from HuggingFace)

Primary LanguagePython

FLAN-T5

Text2Text Generation

Build

To build the docker container, run

make build

Push to Registry

To push the image to a registry, first build the image, then run

docker tag flan-t5:1.0.0 <REGISTRY>/<REPO>/flan-t5:1.0.0

If not logged in to the registry, run

docker login -u <USERNAME> -p <PASSWORD> <REGISTRY>

Then, push the tagged image to a registry

docker push <REGISTRY>/<REPO>/flan-t5:1.0.0

Run Locally

To run the Gradio application locally, run the following

make dev

Deploy

First, make sure your image is pushed to the registry.

Deployment on AI App Store

Check out the AI App Store documentation for full details, but in general:

  1. Create/edit a model card
  2. Pass the docker image URI (e.g <REGISTRY>/<REPO>/flan-t5:1.0.0) when creating/editing the inference service

Other Deployment Options

There are other potential deployment options, including:

  • Google Cloud Run
  • AWS Fargate
  • Red Hat Openshift Serverless