jina-ai/rungpt

Support InstructBLIP model


As the title says: LAVIS just released InstructBLIP, a new vision-language instruction-tuning framework built on BLIP-2 models. It reports state-of-the-art zero-shot generalization performance on a wide range of vision-language tasks. https://github.com/salesforce/LAVIS/tree/main/projects/instructblip

Replicate now has a runnable version:
https://replicate.com/joehoover/instructblip-vicuna13b