simon-ging/coot-videotext
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
Python · Apache-2.0
Stargazers
- x2ss (Shanghai)
- wss321 (Chengdu)
- radioactive11 (New Delhi, India)
- josecohenca
- Xiaolong-han
- choukha (Norway)
- sherylke
- LuoweiZhou
- fuqichen1998 (Seattle, WA)
- forence (Beijing)
- ttengwang (Hong Kong)
- bryant1410 (Ann Arbor, MI, USA)
- liujiaheng (Beijing)
- PKULiuHui
- lzlzlizi
- zhangwanqian
- AAAves
- huaiwen (Inner Mongolia University)
- haoshuai714
- eagle1983
- luxuyang6
- wanganzhi (ChengDu)
- Tramac (Beijing, China)
- Mollylulu
- houzhijian
- Flowerhwang
- junchen14 (Saudi Arabia)
- Chuhanxx
- PipiZong
- JiaxinZhuang (Guangzhou, China)
- kevinlee9 (Shanghai, China)
- cjiang2 (Edmonton)
- scotthavird (Marietta, GA)
- bbking-fly
- TuringKi (China, Chengdu)
- forest520