Pinned Repositories
CV-VAE
[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
FreeNoise
[ICLR 2024] Code for FreeNoise based on VideoCrafter
GPT4Tools
GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.
SEED
Official implementation of SEED-LLaMA (ICLR 2024).
SEED-Bench
[CVPR 2024] A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
SEED-X
Multimodal Models in Real World
TaleCrafter
[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters
UniRepLKNet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
TencentAILab-CVC's Repositories
AILab-CVC/YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
AILab-CVC/VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
AILab-CVC/UniRepLKNet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
AILab-CVC/GPT4Tools
GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.
AILab-CVC/SEED
Official implementation of SEED-LLaMA (ICLR 2024).
AILab-CVC/SEED-X
Multimodal Models in Real World
AILab-CVC/FreeNoise
[ICLR 2024] Code for FreeNoise based on VideoCrafter
AILab-CVC/SEED-Bench
[CVPR 2024] A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
AILab-CVC/CV-VAE
[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
AILab-CVC/TaleCrafter
[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters
AILab-CVC/Animate-A-Story
Retrieval-Augmented Video Generation for Telling a Story
AILab-CVC/VideoGen-Eval
The Dawn of Video Generation: Preliminary Explorations with SORA-like Models
AILab-CVC/Make-Your-Video
[IEEE TVCG 2024] Customized Video Generation Using Textual and Structural Guidance
AILab-CVC/GroupMixFormer
GroupMixAttention and GroupMixFormer
AILab-CVC/M2PT
[CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
AILab-CVC/VL-GPT
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation
AILab-CVC/HiFi-123
[ECCV 2024] HiFi-123: Towards High-fidelity One Image to 3D Content Generation
AILab-CVC/AILab-CVC.github.io
Homepage of Tencent AI Lab CVC.