921112343
Education: Ph.D. candidate at Shanghai Jiao Tong University
Research Interests: Computer Vision
Bachelor's Degree: Vehicle Engineering, Tsinghua University
Pinned Repositories
docs
LangRepo
Language Repository for Long Video Understanding
SeeClick
The model, data and code for the visual GUI Agent SeeClick
Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
VideoTree
Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"