ChrisLiu6's Stars
black-forest-labs/flux
Official inference repo for FLUX.1 models
guoyww/AnimateDiff
Official implementation of AnimateDiff.
microsoft/DeepSpeedExamples
Example models using DeepSpeed
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
CompVis/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
ChenHsing/Awesome-Video-Diffusion-Models
[CSUR] A Survey on Video Diffusion Models
apple/ml-4m
4M: Massively Multimodal Masked Modeling
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
LTH14/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
ckkelvinchan/RealBasicVSR
Official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution"
Mukosame/Zooming-Slow-Mo-CVPR-2020
Fast and Accurate One-Stage Space-Time Video Super-Resolution (accepted in CVPR 2020)
GAIR-NLP/anole
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Alpha-VLLM/Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Vchitect/VEnhancer
Official code for VEnhancer: Generative Space-Time Enhancement for Video Generation
AILab-CVC/SEED-X
Multimodal Models in Real World
louaaron/Score-Entropy-Discrete-Diffusion
[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
Meituan-AutoML/VisionLLaMA
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
mira-space/MiraData
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
sail-sg/zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
Picsart-AI-Research/VideoINR-Continuous-Space-Time-Super-Resolution
[CVPR 2022] VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution
NJU-PCALab/OpenVid-1M
valeoai/Maskgit-pytorch
Unofficial MaskGIT reproduction in PyTorch
gladia-research-group/multi-source-diffusion-models
danier97/LDMVFI
[AAAI'2024] "LDMVFI: Video Frame Interpolation with Latent Diffusion Models", Duolikun Danier, Fan Zhang, David Bull
XiaolongTang23/HPNet
[CVPR 2024] HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention
liuggchen/wechatDatDecode
WeChat .dat file decoder. On Windows, download the exe file and use it directly.
PhyscalX/gradio-image-prompter
Image Prompter for Gradio
LiuDongyang6/METR
A Simple Romance Between Multi-Exit Vision Transformer and Token Reduction (ICLR 2024)