Support InstructBLIP model
numb3r3 commented
As the title says: LAVIS just released InstructBLIP, a new vision-language instruction-tuning framework built on BLIP-2 models. It reports state-of-the-art zero-shot generalization performance on a wide range of vision-language tasks. https://github.com/salesforce/LAVIS/tree/main/projects/instructblip
nomagick commented
Replicate now has a runnable version:
https://replicate.com/joehoover/instructblip-vicuna13b