hlchen23
A second-year Ph.D. student at Tsinghua University. My research interests focus on multimodal learning and LLMs. chenhl23@mails.tsinghua.edu.cn
THU
hlchen23's Stars
Dongping-Chen/ISG
Official code repository for Interleaved Scene Graph.
TencentQQGYLab/ELLA
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
mit-han-lab/vila-u
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
LargeWorldModel/LWM
Large World Model -- Modeling Text and Video with Millions of Context Tokens
allenai/unified-io-2
SkyworkAI/Vitron
NeurIPS 2024 Paper: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
dvirsamuel/PDM
Code for our paper: "Where's Waldo: Diffusion Features For Personalized Segmentation and Retrieval".
TimeMarker-LLM/TimeMarker
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
layer6ai-labs/xpool
https://layer6ai-labs.github.io/xpool/
xuguohai/X-CLIP
An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"
hrtang22/MUSE
Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval"
WHB139426/Grounded-Video-LLM
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
soraw-ai/Awesome-Text-to-Video-Generation
A curated list of Text-to-Video and Image-to-Video works
guyyariv/TempoTokens
This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
microsoft/i-Code
ChenHsing/SimDA
[CVPR 2024] SimDA: Simple Diffusion Adapter for Efficient Video Generation
dvlab-research/Video-P2P
Video-P2P: Video Editing with Cross-attention Control
huangmozhi9527/GMMFormer
[AAAI 2024] GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
hlchen23/VERIFIED
Official repository of NeurIPS D&B Track 2024 paper "VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding" http://arxiv.org/abs/2410.08593
asuc-octo/berkeleytime
UC Berkeley enrollment info
enkeejunior1/Diffusion-Pullback
Official implementation of "Understanding the Latent Space of Diffusion Models through the Lens of Riemannian Geometry" (NeurIPS 2023)
diffusion-hyperfeatures/diffusion_hyperfeatures
Official PyTorch Implementation for Diffusion Hyperfeatures, NeurIPS 2023
google-research/readout_guidance
Official PyTorch Implementation for Readout Guidance, CVPR 2024
Carmenw1203/DanceCamAnimator-Official
[ACM MM 2024] Official PyTorch implementation of "DanceCamAnimator: Keyframe-Based Controllable 3D Dance Camera Synthesis"
Dai-Wenxun/MotionLCM
[ECCV 2024] Official implementation of "MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model"
diffusion-motion-transfer/diffusion-motion-transfer
Official PyTorch implementation of "Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer"
dongzhuoyao/Diffusion-Representation-Learning-Survey-Taxonomy
ali-vilab/VGen
Official repo for VGen: a holistic video generation ecosystem built on diffusion models
uncbiag/Awesome-Foundation-Models
A curated list of foundation models for vision and language tasks
lhanchao777/LVLM-Hallucinations-Survey
The first released survey paper on hallucinations in large vision-language models (LVLMs); this repository collects relevant references to keep the survey continuously updated.