blip2

There are 24 repositories under the blip2 topic.

DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Language: Python 2.9k 33 158265
sled-group/chat-with-nerf
Chat with NeRF enables users to interact with a NeRF model by typing in natural language.
Language: Python 305 5 1419
mlpc-ucsd/BLIVA
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
Language: Python 272 12 2728
gongzix/NeuroClips
Official code base for NeuroClips
Language: MATLAB 61 1 02
SmithaUpadhyaya/fashion_image_caption
Automate fashion image captioning using BLIP-2. Automatically generating descriptions of clothes on shopping websites can help customers without fashion knowledge better understand the features (attributes, style, functionality, etc.) of the items, and can increase online sales by attracting more customers.
Language: Jupyter Notebook 49 3 28
152334H/MiniGPT-4-discord-bot
A true multimodal LLaMA derivative -- on Discord!
Language: Python 44 1 02
eric-ai-lab/ComCLIP
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
Language: Python 35 3 03
kyegomez/qformer
Implementation of Qformer from BLIP2 in Zeta Lego blocks.
Language: Python 31 3 1
BUAADreamer/SPN4CIR
[ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives
Language: Python 19 2 82
nngocson2002/ViVQA
The Multimodal Model for Vietnamese Visual Question Answering (ViVQA)
Language: Python 12 2 00
ZhaoPeiduo/BLIP2-Japanese
Modifying LAVIS' BLIP2 Q-former with models pretrained on Japanese datasets.
Language: Python 12 2 51
matlok-ai/bampe-weights
This repository is for profiling, extracting, visualizing and reusing generative AI weights to hopefully build more accurate AI models and audit/scan weights at rest to identify knowledge domains for risk(s).
Language: Python 9 3 00
jacobmarks/fiftyone-image-captioning-plugin
Caption images across your datasets with state-of-the-art models from Hugging Face and Replicate!
Language: Python 8 3 0
MichiganNLP/visual_diversity_budget
Annotations on a Budget: Leveraging Geo-Data Similarity to Balance Model Performance and Annotation Cost
8 1 00
aws-samples/visual-question-answering-finetuning
Finetuning Large Visual Models on Visual Question Answering
Language: Jupyter Notebook 6 1 01
craigsdennis/scairy
Uses AI to scare people...
Language: Python 4 1 0
leeyunjai/image2text
caption generator using lavis and argostranslate
Language: Python 4 1 01
otdavies/AIOrganizeMyDesktop
Too lazy to organize my desktop, make gpt + BLIP-2 do it /s
Language: Python 2 1 00
shreyassks/Stylised-Image-Captions-with-RL-PPO
Creating stylish social media captions for an Image using Multi Modal Models and Reinforcement Learning
Language: Jupyter Notebook 1 1 00
ergonomech/BLIP-2-Image-Describer
A web-based application that leverages the BLIP-2 model to generate detailed descriptions of uploaded images.
Language: Python 00
notslok/Image-Caption-Generator
An end-to-end, deep-learning-based tool for image caption generation.
Language: Jupyter Notebook 0 1 00
Pavansomisetty21/Visual-Question-Answering-using-Gemini-LLM
This project explores visual question answering with the Gemini LLM, where the image is given as a URL or as a file with any other extension.
Language: Jupyter Notebook 0 1 00
readygetset/inthon2024
Winning solution for image captioning challenge at 2024 InThon Datathon
Language: Jupyter Notebook 0 1 00
thisisiron/QFormer_Pretraining
Implementation of Qformer pre-training
Language: Python 1 0
