
Image Captioning

Image Captioning with BLIP model and deployed using Gradio.

Information

Ho Chi Minh City University of Science

Advanced Image and Video Processing - Assoc. Prof. Lý Quốc Ngọc

K31 - Master of Science - Group: CHOICES

| No. | Student ID | Student Name |
|-----|------------|--------------------|
| 1 | 19127027 | Võ Hoàng Bảo Duy |
| 2 | 19127094 | Phạm Ngọc Thiên Ân |
| 3 | 19127292 | Nguyễn Thanh Tình |

How to run the deployment

1. Clone the repository:

   ```shell
   git clone https://github.com/ngthtinh99/ImageCaptioning.git
   ```

2. Install Python (Python 3.7 - 3.9 is required for PyTorch support).

3. Install the necessary libraries:

   ```shell
   pip install requests torch torchvision gradio timm fairscale transformers
   ```

4. Run the app. The first run downloads the model, which takes about 5 minutes; subsequent runs reuse the cached model:

   ```shell
   python app.py
   ```

5. Open the app on localhost via http://localhost:7860, or use the public link printed in the command prompt.

6. Enjoy 🙂
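For reference, an `app.py` for this kind of deployment can be sketched roughly as below. The model ID (`Salesforce/blip-image-captioning-base`), function names, and Gradio wiring are illustrative assumptions, not taken from the repository:

```python
def caption_image(image):
    """Return a BLIP-generated caption for a PIL image."""
    # Imports are deferred so the (large) model is only downloaded/loaded
    # when a caption is actually requested.
    from transformers import BlipProcessor, BlipForConditionalGeneration
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)

def build_demo():
    """Wrap the captioning function in a simple Gradio interface."""
    import gradio as gr
    return gr.Interface(
        fn=caption_image,
        inputs=gr.Image(type="pil"),
        outputs="text",
        title="Image Captioning with BLIP",
    )

if __name__ == "__main__":
    # share=True also prints a temporary public link in the terminal,
    # matching the "Public link" mentioned in the steps above.
    build_demo().launch(share=True)
```

Launching with `share=True` is what produces the public link alongside the localhost one; drop it to serve only on http://localhost:7860.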

References

[1] Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation.

[2] Gradio: Build Machine Learning Web Apps in Python.