yanan1989

I am a research engineer at KDDI Research.

KDDI Research

Pinned Repositories

acmart
ACM consolidated LaTeX styles
Language: TeX
ActionGenome
A video database bridging human actions and human-object relationships
Language: Python
Additional-EmotiW-dataset
Additional datasets for the group-based cohesion and emotion understanding tasks. It contains situation description text for static and dynamic visual data.
AGQA_baselines_code
Language: Python
An-awsome-homePageSample
Language: HTML
bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Language: Jupyter Notebook
CMU-MultimodalSDK
CMU MultimodalSDK is a machine learning platform for development of advanced multimodal models as well as easily accessing advanced multimodal datasets.
Language: Python
DenseNet-Cifar10
Train DenseNet on Cifar-10 based on Keras
Language: Python
dest_agqa
The official implementation of Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling (BMVC 2022 Spotlight).
Language: Python
ICCV-2023-Papers
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!

yanan1989's Repositories

yanan1989/Additional-EmotiW-dataset
Additional datasets for the group-based cohesion and emotion understanding tasks. It contains situation description text for static and dynamic visual data.
yanan1989/dest_agqa
The official implementation of Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling (BMVC 2022 Spotlight).
Language: Python
yanan1989/ICCV-2023-Papers
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!
yanan1989/AGQA_baselines_code
Language: Python
yanan1989/An-awsome-homePageSample
Language: HTML
yanan1989/Distillation_methods
[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods
Language: Python
yanan1989/Driving-with-LLMs
PyTorch implementation for the paper "Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving"
yanan1989/FERV39k
yanan1989/GraphCLIP_VGT
Video Graph Transformer for Video Question Answering (ECCV'22)
Language: Python
yanan1989/home-robot
Mobile manipulation research tools for roboticists
Language: Python
yanan1989/KP-GNN
Source code for "How Powerful are K-hop Message Passing Graph Neural Networks" (NeurIPS 2022)
Language: Python
yanan1989/LinkBERT
[ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links
Language: Python
yanan1989/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python
yanan1989/merlot
MERLOT: Multimodal Neural Script Knowledge Models
Language: Python
yanan1989/moma
A dataset for multi-object multi-actor activity parsing
Language: Jupyter Notebook
yanan1989/moma-hypergraph
yanan1989/moma-model
Language: Python
yanan1989/momatools
Language: Python
yanan1989/online-cv
A minimal Jekyll Theme to host your resume (CV)
yanan1989/PromptKG
PromptKG Family: a Gallery of Prompt Learning & KG-related research works, toolkits, and paper-list.
Language: Python
yanan1989/qagnn
[NAACL 2021] QAGNN: Question Answering using Language Models and Knowledge Graphs 🤖
Language: Python
yanan1989/Scene-Graph-Benchmark.pytorch
A new codebase for popular Scene Graph Generation methods (2020). Visualization and scene graph extraction on custom images and datasets are provided. It is also a PyTorch implementation of the paper “Unbiased Scene Graph Generation from Biased Training” (CVPR 2020).
Language: Jupyter Notebook
yanan1989/SGAE
yanan1989/STTran
Spatial-Temporal Transformer for Dynamic Scene Graph Generation (ICCV 2021)
Language: Jupyter Notebook
yanan1989/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Language: Python
yanan1989/ViHGNN
[ICCV2023] "Vision HGNN: An Image is More than a Graph of Nodes" by Yan Han, Peihao Wang, Souvik Kundu, Ying Ding, and Zhangyang Wang
Language: Python
yanan1989/vilbert_beta
Language: Jupyter Notebook
yanan1989/VS3_CVPR23
Code for CVPR23 paper: Learning to Generate Language-supervised and Open-vocabulary Scene Graph using Pre-trained Visual-Semantic Space
Language: Python
yanan1989/webvid
Large-scale text-video dataset. 10 million captioned short videos.
Language: Python
yanan1989/yanan1989.github.io
Language: CSS