xyxxmb's Stars
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
lllyasviel/Omost
Your image is almost there!
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
Tencent/HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
BadToBest/EchoMimic
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
XLabs-AI/x-flux
XLabs-AI/x-flux-comfyui
megvii-research/HiDiffusion
[ECCV 2024] HiDiffusion: Increases the resolution and speed of your diffusion model by only adding a single line of code!
cubiq/PuLID_ComfyUI
PuLID native implementation for ComfyUI
csyxwei/ELITE
ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation (ICCV 2023, Oral)
AIGText/Glyph-ByT5
[ECCV 2024] Official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering"
maitrix-org/Pandora
Pandora: Towards General World Model with Natural Language Actions and Video States
hehao13/CameraCtrl
wyysf-98/CraftsMan
CraftsMan: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner
MC-E/ReVideo
sail-sg/CLoT
CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation".
haoosz/ViCo
Official PyTorch codes for the paper: "ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation"
open-mmlab/StyleShot
StyleShot: A SnapShot on Any Style. A model that can transfer any style to any content, generating high-quality, personalized stylized images without per-image fine-tuning!
zibojia/COCOCO
Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility.
bytedance/MoMA
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
Ling-APE/ComfyUI-All-in-One-FluxDev-Workflow
An All-in-One FluxDev workflow in ComfyUI that combines various techniques for generating images with the FluxDev model, including img-to-img and text-to-img. The workflow supports LoRAs, ControlNets, negative prompting with KSampler, dynamic thresholding, inpainting, and more.
discus0434/aesthetic-predictor-v2-5
SigLIP-based Aesthetic Score Predictor
stylus-diffusion/stylus
PairCustomization/PairCustomization
Mowenyii/PAE
[CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation
CodeGoat24/Face-diffuser
[CVPR 2024] Official implementation of High-fidelity Person-centric Subject-to-Image Synthesis.
wfanyue/DPG-T2I-Personalization
[ECCV 2024] Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning
PrototypeNx/DETEX
Decoupled Textual Embeddings for Customized Image Generation (AAAI 2024)