Pinned Repositories
animeGAN
A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.
ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
moment_detr
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset
recurrent-transformer
[ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
scipy-lecture-notes-zh-CN
Chinese version of scipy-lecture-notes. The website is offline; updates continue as offline HTML, see release.
singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
TVCaption
[ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset
TVQA
[EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering
TVQAplus
[ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
TVRetrieval
[ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
jayleicn's Repositories
jayleicn/animeGAN
A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.
jayleicn/ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
jayleicn/moment_detr
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset
jayleicn/recurrent-transformer
[ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
jayleicn/TVQA
[EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering
jayleicn/TVRetrieval
[ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
jayleicn/singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
jayleicn/TVQAplus
[ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
jayleicn/TVCaption
[ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset
jayleicn/VideoLanguageFuturePred
[EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction
jayleicn/mTVRetrieval
[ACL 2021] mTVR: Multilingual Video Moment Retrieval
jayleicn/pytorch-pretrained-BERT
A copy of https://github.com/huggingface/pytorch-pretrained-BERT
jayleicn/video_feature_extractor
Easy to use video deep features extractor
jayleicn/2D-TAN
AAAI'20 - Learning 2D Temporal Localization Networks for Moment Localization with Natural Language
jayleicn/accelerate
A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision
jayleicn/ALPRO
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
jayleicn/CLIP
Contrastive Language-Image Pretraining
jayleicn/coot-videotext
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
jayleicn/detr
End-to-End Object Detection with Transformers
jayleicn/easyturk
Make quick mechanical turk HTML/Javascript interfaces and launch them with Python functions
jayleicn/HERO-1
Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
jayleicn/info-ground
Learning phrase grounding from captioned images through InfoNCE bound on mutual information
jayleicn/just-ask
[TPAMI Special Issue on ICCV 2021 Best Papers, Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
jayleicn/mmaction2-1
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
jayleicn/mmf-1
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
jayleicn/Oscar
Oscar and VinVL
jayleicn/releasing-research-code
Tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations)
jayleicn/SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
jayleicn/UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
jayleicn/YouCook2-Leaderboard
A one-stop shop for YouCook2 info such as leaderboard and recent advances on (cooking) video retrieval and captioning.