YuchenLiu98/COMM
PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"
MIT License
Watchers
- dnth (zenml-io)
- dydxdt
- gisbi-kim (NAVER LABS)
- Haotian-Zhang (Apple AI/ML)
- hikame
- hkf
- isaacperez
- itruonghai (University of Science)
- JiazhengChai (Japan)
- liuguoyou
- LiWentomng (Zhejiang University)
- lj163ucas
- mu-cai (University of Wisconsin - Madison)
- pengyulong
- RenShuhuai-Andy (Peking University)
- wx-b (RIOS)
- ytaek-oh (KAIST)
- YuchenLiu98 (SJTU)
- yuw-nv (Nvidia)