huangmozhi9527

Just let it go.

Tsinghua University, Shenzhen, China

huangmozhi9527's Stars

yuweihao/MambaOut
MambaOut: Do We Really Need Mamba for Vision?
Language: Python 2k 8 24434
OpenDriveLab/DriveAGI
[CVPR 2024 Highlight] GenAD: Generalized Predictive Model for Autonomous Driving & Foundation Models in Autonomous System
Language: Python 612 29 1124
magic-research/PLLaVA
Official repository for the paper PLLaVA
Language: Python 593 13 7740
OpenDriveLab/OpenLane-V2
[NeurIPS 2023 Track Datasets and Benchmarks] OpenLane-V2: The First Perception and Reasoning Benchmark for Road Driving
Language: Jupyter Notebook 558 21 10866
rese1f/MovieChat
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
Language: Python 527 10 8041
OpenDriveLab/PersFormer_3DLane
[ECCV 2022 Oral] Perspective Transformer on 3D Lane Detection
Language: Python 433 14 12578
RenShuhuai-Andy/TimeChat
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Language: Python 289 5 4525
OpenDriveLab/TopoNet
Topology Reasoning for Scene Perception in Autonomous Driving
Language: Python 286 23 2112
reka-ai/reka-vibe-eval
Multimodal language model benchmark, featuring challenging examples
Language: Python 149 16 46
HJYao00/DenseConnector
[NeurIPS 2024] Dense Connector for MLLMs
Language: Python 140 3 75
jpthu17/DiffusionRet
[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
Language: Python 120 3 106
ByungKwanLee/Meteor
[NeurIPS 2024] Official PyTorch implementation of Mamba-based traversal of rationale (Meteor), which aims to improve vision-language performance across diverse capabilities.
Language: Python 102 1 54
liguopeng0923/UCVGL
[CVPR 2024🔥] Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization
Language: Python 91 2 10
IMCCretrieval/ProST
Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval --ICCV2023 Oral
Language: Python 90 3 71
IMCCretrieval/MomentDiff
MomentDiff: Generative Video Moment Retrieval from Random to Real--NeurIPS 2023
Language: Python 75 4 90
yangnianzu0515/MoleRec
The official implementation of our paper "MoleRec: Combinatorial Drug Recommendation with Substructure-Aware Molecular Representation Learning" (TheWebConf 2023).
Language: Python 55 1 64
whwu95/FreeVA
FreeVA: Offline MLLM as Training-Free Video Assistant
Language: Python 48 2 70
TencentARC/TVTS
Turning to Video for Transcript Sorting
Language: Jupyter Notebook 46 4 22
mbzuai-oryx/CVRR-Evaluation-Suite
Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs".
Language: Python 42 0 03
gimpong/MM23-MISSRec
The code for the paper "MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation" (ACM MM'23).
Language: Python 37 1 57
gengyuanmax/MeVTR
Official GitHub repo for the ICCV 2023 paper 'Multi-event Video-Text Retrieval'
Language: Python 18 2 60
EricLee8/Multi-party-Dialogue-MRC
Codes and data for EMNLP 2021 paper "Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Reading Comprehension"
Language: Python 16 2 21
EricLee8/BiDeN
The official code of our paper at EMNLP 2022: Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling
Language: Python 15 1 10
huangmozhi9527/GMMFormer
[AAAI 2024] GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
Language: Python 14 2 12
HuiGuanLab/DL-DKD
Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval
Language: Python 13 2 21
duyali2000/MQMC
This repo has the PyTorch implementation and datasets of our WSDM 2023 paper: “Multi-queue Momentum Contrast for Microvideo-Product Retrieval”.
Language: Python 10 2 11
EricLee8/MPD_EMVI
Official implementation of our paper at ACL 2023: Pre-training Multi-party Dialogue Models with Latent Discourse Inference
Language: Python 10 1 00
EricLee8/SPACE
The official codes for our paper at COLING 2022: Semantic-Preserving Adversarial Code Comprehension
Language: Python 9 1 00
sangminwoo/AvisC
Official PyTorch implementation of "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models"
Language: Python 8 2 60
huangmozhi9527/GMMFormer_v2
GMMFormer v2: An Uncertainty-aware Framework for Partially Relevant Video Retrieval
Language: Python 7 1 11
