Serving BLIP image captioning with BentoML

This project is a BLIP image captioning service built with BentoML.

Try it on BentoCloud. Deploy Now

It comes with Salesforce/blip-image-captioning-large; you can easily customize it to try other image-to-text models.

🏃 Quick start 🏃

To quickly get started, follow the instructions below or try this tutorial in Google Colab.

1. Install dependencies

pip install -U -q -r requirements.txt

2. Import model

We can download the BLIP model from Hugging Face into the local BentoML model store using the BentoML SDK.

python import_model.py

3. Start Server

We can start the server locally with a single command:

bentoml serve service:svc
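
The model can take a while to load after the command returns control to the log output, so scripts that send requests right away may fail. The following is a minimal sketch of a readiness check, assuming the server listens on port 3000 and exposes BentoML's /readyz health endpoint; the helper name wait_until_ready and its parameters are illustrative and not part of this project.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url="http://127.0.0.1:3000", timeout=60.0, interval=1.0):
    """Poll <base_url>/readyz until it answers 200 or the timeout expires.

    Hypothetical helper: the URL, timeout, and interval are assumptions.
    Returns True once the server reports ready, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/readyz", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or a non-2xx answer: not ready yet.
            pass
        time.sleep(interval)
    return False
```

Calling wait_until_ready() before sending the first request lets a test script block until the service is up, or give up after the timeout.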

🎯 Try it out 🎯

There are several ways to interact with the server:

1. Web UI

Open http://0.0.0.0:3000 in your browser to send test requests from the Web UI.

2. Raw HTTP

Alternatively, test the server with curl. The model supports both unconditional captioning (leave prompt empty) and conditional captioning (pass a prompt for the caption to continue from).

curl -X 'POST' \
  'http://127.0.0.1:3000/generate' \
  -H 'accept: text/plain' \
  -H 'Content-Type: multipart/form-data' \
  -F 'img=@three-dog.jpg;type=image/jpeg' \
  -F 'prompt='

curl -X 'POST' \
  'http://127.0.0.1:3000/generate' \
  -H 'accept: text/plain' \
  -H 'Content-Type: multipart/form-data' \
  -F 'img=@three-dog.jpg;type=image/jpeg' \
  -F 'prompt=this picture is '
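
The same two requests can be sketched in Python with only the standard library. This is an illustrative equivalent of the curl calls above, not the project's client.py: the function names build_caption_request and caption are hypothetical, while the /generate endpoint and the img and prompt field names are taken from the curl commands.

```python
import mimetypes
import os
import urllib.request
import uuid


def build_caption_request(image_bytes, filename, prompt="",
                          url="http://127.0.0.1:3000/generate"):
    """Build (but do not send) a multipart/form-data POST with img and prompt fields."""
    boundary = uuid.uuid4().hex
    name = os.path.basename(filename)
    content_type = mimetypes.guess_type(name)[0] or "application/octet-stream"
    delimiter = b"--" + boundary.encode()
    body = b"\r\n".join([
        delimiter,
        f'Content-Disposition: form-data; name="img"; filename="{name}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",
        image_bytes,
        delimiter,
        b'Content-Disposition: form-data; name="prompt"',
        b"",
        prompt.encode("utf-8"),
        delimiter + b"--",  # closing delimiter
        b"",
    ])
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "text/plain",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def caption(image_path, prompt=""):
    """Send the image (and optional prompt) to a running server; return the caption."""
    with open(image_path, "rb") as f:
        req = build_caption_request(f.read(), image_path, prompt)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

With the server running, caption("three-dog.jpg") would request an unconditional caption and caption("three-dog.jpg", "this picture is ") a conditional one.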

3. BentoML client

Run the following client script to get the result:

python client.py

🚀 Production Deployment 🚀

To deploy the service to production, we need to build the Bento first. Learn more about Bento.

Build and push Bento

> bentoml build

██████╗ ███████╗███╗   ██╗████████╗ ██████╗ ███╗   ███╗██╗
██╔══██╗██╔════╝████╗  ██║╚══██╔══╝██╔═══██╗████╗ ████║██║
██████╔╝█████╗  ██╔██╗ ██║   ██║   ██║   ██║██╔████╔██║██║
██╔══██╗██╔══╝  ██║╚██╗██║   ██║   ██║   ██║██║╚██╔╝██║██║
██████╔╝███████╗██║ ╚████║   ██║   ╚██████╔╝██║ ╚═╝ ██║███████╗
╚═════╝ ╚══════╝╚═╝  ╚═══╝   ╚═╝    ╚═════╝ ╚═╝     ╚═╝╚══════╝

Successfully built Bento(tag="image_captioning-svc:virrs2tnskexaasc").

BentoML provides a number of deployment options. The easiest way to set up a production-ready endpoint for your image captioning service is via BentoCloud, the serverless cloud platform built for BentoML by the BentoML team.

Next steps:

Sign up for a BentoCloud account here.
Get an API Token, see instructions here.
Push your Bento to BentoCloud: bentoml push image_captioning-svc:virrs2tnskexaasc
Deploy via Web UI, see Deploying on BentoCloud

Community

👉 Join our AI Application Developer community!

xianml/blip-image-captioning-bento