/DiCO

Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization (BMVC 2024 Oral ✨)

Primary Language: Python