/Efficient-Vision-Language-Pre-training-by-Cluster-Masking

[CVPR 2024] Improving vision-language pre-training efficiency by performing cluster-based masking on images.

Primary Language: Python
