zhangquanwei962's Stars
magic-research/magic-animate
[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
apple/ml-mgie
haofanwang/cropimage
A simple toolkit for detecting and cropping main body from pictures. Support face and saliency detection.
mini-sora/minisora
MiniSora: A community that aims to explore the implementation path and future development direction of Sora.
yzhang2016/video-generation-survey
A reading list of video generation
open-mmlab/PowerPaint
[ECCV 2024] PowerPaint, a high-quality, versatile image inpainting model that supports text-guided object inpainting, object removal, image outpainting, and shape-guided object inpainting with only a single model.
vislearn/ControlNet-XS
cientgu/InstructDiffusion
PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.
webtoon/dreamstyler
Official implementation of "DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models" (AAAI24)
FreeStyleFreeLunch/FreeStyle
FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models
google/style-aligned
Official code for "Style Aligned Image Generation via Shared Attention"
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
TencentARC/MasaCtrl
[ICCV 2023] Consistent Image Synthesis and Editing
Ucas-HaoranWei/Vary
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
yeungchenwa/FontDiffuser
[AAAI2024] FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
genforce/freecontrol
Official implementation of CVPR 2024 paper: "FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition"
nmhkahn/dreamstyler
sled-group/InfEdit
[CVPR 2024] Official implementation of "Inversion-Free Image Editing with Natural Language"
Jamie-Cheung/ArtBank
ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank (AAAI2024)
tesseract-ocr/tesseract
Tesseract Open Source OCR Engine (main repository)
01-ai/Yi
A series of large language models trained from scratch by developers @01-ai
OPPO-Mente-Lab/Subject-Diffusion
Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
tencent-ailab/IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
AILab-CVC/VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Sanster/xy-cut
THUDM/CogVLM
A state-of-the-art open visual language model | multimodal pretrained model
Yuxinn-J/Scenimefy
[ICCV 2023] Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation
langmanbusi/Semantic-Aware-Low-Light-Image-Enhancement
Semantic-Aware LLIE. CVPR 2023
AlenUbuntu/StyleTransfer
A PyTorch library for deep image style transfer. It provides implementations of current SOTA algorithms, including AdaIN, WCT, LinearStyleTransfer, and FastPhotoTransfer.