songjin321

Harbin Institute of Technology

songjin321's Stars

lllyasviel/LuminaBrush
Illumination Drawing Tools for Text-to-Image Diffusion Models
3758
facebookresearch/flow_matching
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
Language: Python | 1.6k stars | 63 forks
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
Language: Jupyter Notebook | 1.7k stars | 100 forks
microsoft/TRELLIS
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".
Language: Python | 5.6k stars | 345 forks
Shakker-Labs/ComfyUI-IPAdapter-Flux
Language:Python22315
NVlabs/addit
2243
instantX-research/Regional-Prompting-FLUX
Training-free Regional Prompting for Diffusion Transformers 🔥
Language:Python48619
gokayfem/ComfyUI_VLM_nodes
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
Language:Python43641
kornia/kornia
Geometric Computer Vision Library for Spatial AI
Language: Python | 10.1k stars | 978 forks
bghira/SimpleTuner
A general fine-tuning kit geared toward diffusion models.
Language: Python | 1.9k stars | 184 forks
facebookresearch/sapiens
High-resolution models for human tasks.
Language: Python | 4.7k stars | 268 forks
tyxsspa/AnyText
Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>
Language: Python | 4.4k stars | 286 forks
FudanVI/benchmarking-chinese-text-recognition
This repository contains datasets and baselines for benchmarking Chinese text recognition.
Language:Python44352
ostris/ai-toolkit
Various AI scripts. Mostly Stable Diffusion stuff.
Language: Python | 3.7k stars | 411 forks
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Language: Python | 13k stars | 906 forks
AIGText/Glyph-ByT5
[ECCV2024] Official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering"
Language:Jupyter Notebook52823
black-forest-labs/flux
Official inference repo for FLUX.1 models
Language: Python | 18.7k stars | 1.3k forks
yk7333/d3po
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
Language:Python17918
Haian-Jin/Neural_Gaffer
[NeurIPS 2024] Official code for "Neural Gaffer: Relighting Any Object via Diffusion"
Language:Python2416
Chanzhaoyu/chatgpt-web
A ChatGPT demo web page built with Express and Vue3
Language: Vue | 31.7k stars | 11.2k forks
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
Language: Python | 5.6k stars | 472 forks
lllyasviel/Omost
Your image is almost there!
Language: Python | 7.4k stars | 427 forks
instantX-research/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
Language: Python | 11.3k stars | 823 forks
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Language: Python | 2.9k stars | 182 forks
ViTAE-Transformer/ViTPose
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
Language: Python | 1.4k stars | 188 forks
Q-Future/Q-Align
③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can be efficiently fine-tuned on downstream datasets.
Language:Python32222
ziqihuangg/Awesome-Evaluation-of-Visual-Generation
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
21512
nullquant/ComfyUI-BrushNet
ComfyUI BrushNet nodes
Language:Python69825
lllyasviel/IC-Light
More relighting!
Language: Python | 7.2k stars | 415 forks
HVision-NKU/StoryDiffusion
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
Language: Jupyter Notebook | 6.1k stars | 608 forks
