Image-Captioning-Web-App-with-Gemini-Pro-Vision

Introduction

Image captioning has become an essential tool for making content accessible and interactive in digital spaces. With the advent of advanced multimodal models like Google’s Gemini Pro Vision, generated captions have become more accurate and contextually relevant. In this blog, we will explore how to build a simple web application using Streamlit and Google’s Gemini Pro Vision that generates captions for uploaded images.
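To make the idea concrete, here is a minimal sketch of how such an app could be wired together. It is not the repository's actual code: the function names, the prompt text, the `GOOGLE_API_KEY` environment variable, and the widget labels are all assumptions made for illustration, and the `gemini-pro-vision` model name is taken from the project title and may not be available on every account.

```python
from typing import Any

# Hypothetical default prompt; the real project may use different wording.
DEFAULT_PROMPT = "Write a short, descriptive caption for this image."


def generate_caption(model: Any, image: Any, prompt: str = DEFAULT_PROMPT) -> str:
    """Send the prompt and image to a Gemini-style model and return the caption text."""
    # generate_content accepts a list that mixes text and image parts.
    response = model.generate_content([prompt, image])
    return response.text.strip()


def run_app() -> None:
    """Streamlit page: upload an image, then show a generated caption."""
    # Third-party imports stay inside the function so that generate_caption
    # can be imported and tested without these packages installed.
    import os

    import google.generativeai as genai
    import streamlit as st
    from PIL import Image

    # Assumption: the API key is supplied through an environment variable.
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-pro-vision")

    st.title("Image Captioning with Gemini Pro Vision")
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if uploaded is not None:
        image = Image.open(uploaded)
        st.image(image, caption="Uploaded image")
        if st.button("Generate caption"):
            st.write(generate_caption(model, image))


# In an app script, call run_app() at the bottom of the file:
# run_app()
```

Saved as, say, app.py with the last line uncommented, this could be started with `streamlit run app.py`. Keeping the model call in its own small function makes it easy to swap the prompt or test the logic with a stand-in model.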

STEPS to run the project:

STEP 01- Clone the repository

Project repo: https://github.com/riad5089/Image-Captioning-Web-App-with-Gemini-Pro-Vision.git

STEP 02- Create a virtual environment after opening the repository

python -m venv env

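env\Scripts\activate

The command above is for Windows. On macOS or Linux, activate the environment with:

source env/bin/activate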

STEP 03- Install the requirements

pip install -r requirements.txt

Project Demo

Deployment

I built this web application with the Streamlit framework. It is hosted on share.streamlit.io; you can check out the app here.
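A hosted app needs access to the Gemini API key without committing it to the repository. The sketch below shows one way this could be handled, assuming the key is stored under the name `GOOGLE_API_KEY` (a hypothetical name, not confirmed by the repository): look in a secrets mapping first, then fall back to an environment variable.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    secrets: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
    name: str = "GOOGLE_API_KEY",  # assumed key name, for illustration only
) -> Optional[str]:
    """Return the API key from the secrets mapping, else from the environment."""
    if name in secrets:
        return secrets[name]
    # Fall back to an environment variable; returns None when neither is set.
    return environ.get(name)
```

In the deployed app, `st.secrets` could be passed as the first argument; when running locally without a secrets file, passing an empty dict falls back to the environment variable.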