ttengwang
Ph.D. student in computer science. My research interests lie in deep learning and computer vision, focusing on vision-language multimodal learning.
The University of Hong Kong, Hong Kong
Pinned Repositories
FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
action-detection
temporal action detection with SSN
Awesome_Long_Form_Video_Understanding
Awesome papers & datasets specifically focused on long-term videos.
Awesome_Prompting_Papers_in_Computer_Vision
A curated list of prompt-based papers in computer vision and vision-language learning.
Caption-Anything
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything
dense-video-captioning-pytorch
Second-place solution to the dense video captioning task in the ActivityNet Challenge (CVPR 2020 workshop)
ECHR
Code for the paper "Event-centric hierarchical representation for dense video captioning" (TCSVT 2020)
ESGN
Event Sequence Generation Network
PDVC
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
VLMixer
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix (ICML 2022)
ttengwang's Repositories
ttengwang/Caption-Anything
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything
ttengwang/Awesome_Prompting_Papers_in_Computer_Vision
A curated list of prompt-based papers in computer vision and vision-language learning.
ttengwang/PDVC
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
ttengwang/Awesome_Long_Form_Video_Understanding
Awesome papers & datasets specifically focused on long-term videos.
ttengwang/dense-video-captioning-pytorch
Second-place solution to the dense video captioning task in the ActivityNet Challenge (CVPR 2020 workshop)
ttengwang/VLMixer
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix (ICML 2022)
ttengwang/ESGN
Event Sequence Generation Network
ttengwang/ECHR
Code for the paper "Event-centric hierarchical representation for dense video captioning" (TCSVT 2020)
ttengwang/awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
ttengwang/awesome-Vision-and-Language-Pre-training
Recent Advances in Vision and Language Pre-training (VLP)
ttengwang/awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
ttengwang/cider
Python code for CIDEr - Consensus-based Image Caption Evaluation
ttengwang/coco-caption
ttengwang/densecap
Dense video captioning in PyTorch
ttengwang/densevid_eval
Evaluation code for Dense-Captioning Events in Videos
ttengwang/ENAS-pytorch
PyTorch implementation of "Efficient Neural Architecture Search via Parameters Sharing"
ttengwang/EVA
EVA Series: Visual Representation Fantasies from BAAI
ttengwang/faster-rcnn.pytorch
A faster PyTorch implementation of Faster R-CNN
ttengwang/grounding_changing_distribution
ttengwang/hidden-networks
ttengwang/ImageCaptioning.pytorch
Image captioning codebase in PyTorch (finetunable CNN in the "with_finetune" branch; diverse beam search can be found in the "dbs" branch; self-critical training is under my self-critical.pytorch repository).
ttengwang/lmms-eval
Accelerating the development of large multimodal models (LMMs) with lmms-eval
ttengwang/merlot
MERLOT: Multimodal Neural Script Knowledge Models
ttengwang/PrefixTuning
Prefix-Tuning: Optimizing Continuous Prompts for Generation
ttengwang/PromptPapers
Must-read papers on prompt-based tuning for pre-trained language models.
ttengwang/rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
ttengwang/self-critical.pytorch
Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.
ttengwang/slowfast_feature_extractor
Feature Extractor module for videos using the PySlowFast framework
ttengwang/STR
TMM: show, tell and rephrase
ttengwang/ttengwang