vision-and-language-pre-training

There are 9 repositories under the vision-and-language-pre-training topic.

salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Language: Jupyter Notebook
4.7k 34 195621
OFA-Sys/Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Language: Python
4.4k 34 327452
phellonchen/awesome-Vision-and-Language-Pre-training
Recent Advances in Vision and Language Pre-training (VLP)
287 11 315
zhjohnchan/awesome-vision-and-language-pretraining
A curated list of vision-and-language pre-training (VLP). :-)
56 3 07
mala-lab/SIC-CADS
Code Implementation of "Simple Image-level Classification Improves Open-vocabulary Object Detection" (AAAI'24)
Language: Python
21 1 93
PrithivirajDamodaran/vision-language-modelling-series
Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations
Language: Jupyter Notebook
14 2 04
JianqiangWan/VLPT-STD
Vision-Language Pre-Training for Boosting Scene Text Detectors (CVPR2022)
11 5 50
marialymperaiou/knowledge-enhanced-multimodal-learning
A list of research papers on knowledge-enhanced multimodal learning
7 1 00
SHTUPLUS/GITM-MR
The official implementation for the ICCV 2023 paper "Grounded Image Text Matching with Mismatched Relation Reasoning".
Language: Python
6 3 00