Cece1031's Stars
BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles: Latest Advances on Multimodal Large Language Models
xudejing/video-question-answering
Video Question Answering via Gradually Refined Attention over Appearance and Motion
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o's performance.
bytedance/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
schowdhury671/meerkat
DAMO-NLP-SG/VCD
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
rikeilong/Bay-CAT
[ECCV’24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
GeWu-Lab/MUSIC-AVQA
MUSIC-AVQA (CVPR 2022 Oral)
Ziyang412/VideoTree
Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
OpenGVLab/Ask-Anything
[CVPR 2024 Highlight] [VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as miniGPT4, StableLM, and MOSS.
gyxxyg/VTG-LLM
[Preprint] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
MengyuanChen21/CVPR2023-CMPAE
[CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
ExplainableML/AVCA-GZSL
This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language"
hlchen23/ADPN-MM
Repository for the ACM MM 2023 accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Grounding"
lucidrains/mirasol-pytorch
Implementation of 🌻 Mirasol, a SOTA multimodal autoregressive model from Google DeepMind, in PyTorch
kyegomez/Mirasol
PyTorch implementation of the model from "Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities"
sauradip/MUPPET
[arXiv 2023] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection"
ttgeng233/UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
OFA-Sys/ONE-PEACE
A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
sauradip/STALE
[ECCV 2022] Official PyTorch implementation of the paper "Zero-Shot Temporal Action Detection via Vision-Language Prompting"
jianzongwu/Awesome-Open-Vocabulary
(TPAMI 2024) A Survey on Open Vocabulary Learning
pengsida/learning_research
My personal research experience
Cadene/pretrained-models.pytorch
Pretrained ConvNets for PyTorch: NASNet, ResNeXt, ResNet, InceptionV4, InceptionResNetV2, Xception, DPN, etc.
52CV/CVPR-2023-Papers
52CV/CV-Surveys
Surveys on computer vision topics, including object detection, tracking, and more.
AccumulateMore/OpenCV
✔ (Completed) The most comprehensive OpenCV notes 【咕泡唐宇迪】
statusrank/XCurve
XCurve is an end-to-end PyTorch library for X-Curve metrics optimizations in machine learning.
PKUFlyingPig/cs-self-learning
A self-study guide for computer science