tenaflyyy

Hangzhou Normal University

tenaflyyy's Stars

OpenMICG/SwiftCraft3D
Efficient Text-to-3D Generation via Semantic-enhanced Sparse-view Prompting with Hybrid Reconstruction
Language: Python | 2
yformer/EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
Language: Jupyter Notebook | 2.1k | 151
OpenMICG/CoCoMeD
Consistency Conditioned Memory Augmented Dynamic Diagnosis Model for Medical Visual Question Answering
Language: Python | 101
OpenMICG/AHP
Adapter-Enhanced Hierarchical Cross-Modal Pre-training for Lightweight Medical Report Generation
Language: Python | 9
OpenMICG/MossVLN
Observation Driven Memory Synergistic Planning for Continuous Vision-Language Navigation
Language: Python | 8
OpenMICG/CSLAKE
A consistent Med-VQA dataset, C-SLAKE, extended by Slake for further consistency assessment.
11
tenaflyyy/CoCoMeD
Consistency Conditioned Memory Augmented Dynamic Diagnosis Model for Medical Visual Question Answering
1
OpenMICG/mcg
Multigranularity Contrastive cross-modal collaborative Generation (MCG) model for Video QA
Language: Python | 102
jayleicn/TVQAplus
[ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
Language: Python | 12324
jayleicn/TVQA
[EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering
Language: Python | 17032
dingmyu/VRDP
[NeurIPS 2021] Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
Language: Python | 457
chuangg/CLEVRER
PyTorch implementation of ICLR 2020 paper "CLEVRER: CoLlision Events for Video REpresentation and Reasoning"
Language: Python | 11126
VRU-NExT/VideoQA
776
chenfei-wu/TaskMatrix
Language: Python | 34.5k | 3.3k
MILVLG/prophet
Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
Language: Python | 26327
Victorwz/VaLM
VaLM: Visually-augmented Language Modeling. ICLR 2023.
Language: Python | 553
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Language: Jupyter Notebook | 4.7k | 624
CCIIPLab/DPT
The code of IJCAI2022 paper, Declaration-based Prompt Tuning for Visual Question Answering
Language: Python | 192
CurryYuan/X-Trans2Cap
[CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning
Language: Python | 333
yuewang-cuhk/awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
1.1k | 101
yz93/LAVT-RIS
Language: Python | 17914
minghangz/cpl
CPL: Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning
Language: Python | 565
floodsung/Deep-Learning-Papers-Reading-Roadmap
Deep Learning papers reading roadmap for anyone who are eager to learn this amazing tech!
Language: Python | 38k | 7.3k
facebookresearch/TimeSformer
The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
Language: Python | 1.5k | 210
antoyang/just-ask
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Language: Jupyter Notebook | 11715
facebookresearch/detr
End-to-End Object Detection with Transformers
Language: Python | 13.4k | 2.4k
mttr2021/MTTR
Language: Python | 64069
tenaflyyy/ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
1
tenaflyyy/hcrn-videoqa
Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)
1
thaolmk54/hcrn-videoqa
Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)
Language: Python | 13026
