R00Kie-Liu

Shanghai Jiao Tong University, Shanghai

R00Kie-Liu's Stars

tychen-SJTU/MECD-Benchmark
[NeurIPS'24 spotlight] MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning
Language:Python20
hotelll/Collaborative_Procedure_Alignment
Implementation of our journal paper "Achieving Procedure-Aware Instructional Video Correlation Learning under Weak Supervision from a Collaborative Perspective"
Language:Python3
haowuxc/DIBS
[CVPR 2024] DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement
Language:Python3
Jiaxuan-Li/EVCap
[CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
Language:Python405
DirtyHarryLYL/LLM-in-Vision
Recent LLM-based CV and related works. Welcome to comment/contribute!
84836
2noise/ChatTTS
A generative speech model for daily dialogue.
Language:Python 33.2k 3.6k
wdndev/llm_interview_note
Notes on knowledge and interview questions for large language model (LLM) algorithm (application) engineers
Language:HTML 4.4k 509
TencentARC/ST-LLM
[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
Language:Python1344
EvolvingLMMs-Lab/lmms-eval
Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
Language:Python 2.2k 178
bpiyush/TestOfTime
Official code for our CVPR 2023 paper: Test of Time: Instilling Video-Language Models with a Sense of Time
Language:Python453
llyx97/TempCompass
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, Lei Li, Sishuo Chen, Xu Sun, Lu Hou
Language:Python932
yunlong10/Awesome-LLMs-for-Video-Understanding
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
1.7k 88
YuxiXie/ECHo
This repository contains data and code for the paper ECHo: Event Causality Inference via Human-centric Reasoning.
Language:Python81
fudan-zvg/Reason2Drive
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving
772
state-spaces/mamba
Mamba SSM architecture
Language:Python 13.6k 1.2k
MILVLG/prophet
Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
Language:Python27027
R00Kie-Liu/Sampler
Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition
Language:Python131
AtomScott/soccer_narrator
narration for soccer
Language:Jupyter Notebook2
google-research/football
Check out the new game server:
Language:Python 3.4k 1.3k
zyayoung/Awesome-Video-LLMs
Explore VLM-Eval, a framework for evaluating Video Large Language Models, enhancing your video analysis with cutting-edge AI technology.
Language:Python302
vaishnaviHimakunthala/VIP
3
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
13.3k 840
WisconsinAIVision/ViP-LLaVA
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Language:Python30421
ggoonnzzaallo/llm_experiments
I play with my best friend GPT
Language:Jupyter Notebook29445
mbzuai-oryx/Video-LLaVA
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
Language:Python24811
doc-doc/NExT-GQA
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
Language:Python621
XgDuan/WSDEC
Weakly Supervised Dense Event Captioning in Videos, i.e. generating multiple sentence descriptions for a video in a weakly-supervised manner.
Language:Python10426
Letian2003/C-VQA
Counterfactual Reasoning VQA Dataset
Language:Python242
BirdFly16/TO-MAR
5
ucas-vg/P2BNet
ECCV2022, Point-to-Box Network for Accurate Object Detection via Single Point Supervision
Language:Python6410