liuzhihui2046's Stars
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
lllyasviel/Fooocus
Focus on prompting and generating
hacksider/Deep-Live-Cam
Real-time face swap and one-click video deepfake with only a single image
meta-llama/llama3
The official Meta Llama 3 GitHub site
Mikoto10032/DeepLearning
Introductory deep learning tutorials and excellent articles (Deep Learning Tutorial)
AliaksandrSiarohin/first-order-model
This repository contains the source code for the paper First Order Motion Model for Image Animation
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
open-mmlab/mmagic
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
google/gemma.cpp
lightweight, standalone C++ inference engine for Google's Gemma models.
HVision-NKU/StoryDiffusion
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
sczhou/ProPainter
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
lllyasviel/IC-Light
More relighting!
facebookincubator/AITemplate
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
facebookresearch/co-tracker
CoTracker is a model for tracking any point (pixel) on a video.
ostris/ai-toolkit
Various AI scripts. Mostly Stable Diffusion stuff.
AiuniAI/Unique3D
[NeurIPS 2024] Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
williamyang1991/Rerender_A_Video
[SIGGRAPH Asia 2023] Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
xiaobai1217/Awesome-Video-Datasets
Video datasets
Fictionarry/ER-NeRF
[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis
Tencent/DepthCrafter
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
JackAILab/ConsistentID
Customized ID-consistent generation for humans
phoenix104104/fast_blind_video_consistency
Learning Blind Video Temporal Consistency (ECCV 2018)
FusionBrainLab/HairFastGAN
Official Implementation for "HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach"
WisconsinAIVision/ViP-LLaVA
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
XiaoyuShi97/VideoFlow
Official implementation of ICCV2023 VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation
lyogavin/train_your_own_sora
XiaoyuShi97/FlowFormerPlusPlus
FlowFormer++: Masked Cost Volume Autoencoding for Pretraining Optical Flow Estimation