yangxiao2's Stars
UCSC-VLAA/CLIPS
An Enhanced CLIP Framework for Learning with Synthetic Captions
microsoft/LLM2CLIP
LLM2CLIP makes SOTA pretrained CLIP models even more SOTA.
google-research-datasets/conceptual-captions
Conceptual Captions is a dataset containing (image-URL, caption) pairs designed for the training and evaluation of machine-learned image captioning systems.
Alibaba-NLP/OmniSearch
Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
StevenGrove/LearnableTreeFilterV2
Hazqeel09/ellzaf_ml
Bridging Research and Practice with PyTorch
QinYang79/Awesome-Noisy-Correspondence
This is a summary of research on noisy correspondence; there may be omissions. If anything is missing, please get in touch with us at linyijie.gm@gmail.com, yangmouxing@gmail.com, or qinyang.gm@gmail.com.
amusi/ECCV2024-Papers-with-Code
A collection of ECCV 2024 papers and open-source projects. Everyone is welcome to submit issues to share ECCV 2024 papers and open-source projects.
ErgastiAlex/MARS
happylinze/multi-modal-tsm
The official repository for the Multi-Modal Visual Pattern Recognition Challenge-Track3 Baseline (ICPR 2024)
XU-TIANYANG/ZOO145
145 video sequences with 20 categories from the zoo, forming a new test set for animal tracking, coined ZOO145
taosdata/TDengine
High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
zengwang430521/TCFormer
The codes for TCFormer in paper: Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer
ArrowLuo/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
jihoo-kim/RecSys-Papers-from-SIGIR-2021
Papers related to the Recommender System from SIGIR 2021 (including the links for Paper PDF, Github Code and Dataset)
CryhanFang/CLIP2Video
starmemda/CAMoE
dair-ai/ml-visuals
🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.
JingweiJ/ActionGenome
A video database bridging human actions and human-object relationships
yrcong/STTran
Spatial-Temporal Transformer for Dynamic Scene Graph Generation, ICCV2021
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
jayleicn/ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
m-bain/frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
XU-TIANYANG/LSDCF
Learning Low-rank and Sparse Discriminative Correlation Filters for Coarse-to-Fine Visual Object Tracking
m-bain/webvid
Large-scale text-video dataset. 10 million captioned short videos.
CRIPAC-DIG/GHRM
[WWW 2021] Source code and datasets for the paper "Graph-based Hierarchical Relevance Matching Signals for Ad-hoc Retrieval".
HuiChen24/IMRAM
code for our CVPR2020 paper "IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval"
kuanghuei/SCAN
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
divelab/DeeperGNN
Official PyTorch implementation of "Towards Deeper Graph Neural Networks" [KDD2020]