Space-Xun's Stars
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
sebastianstarke/AI4Animation
Bringing Characters to Life with Computer Brains in Unity
ostris/ai-toolkit
Various AI scripts. Mostly Stable Diffusion stuff.
F4bwDP6a6W/FLY_US
US college application prep materials: how to apply to US colleges
baaivision/Emu3
Next-Token Prediction is All You Need
Vchitect/Latte
Latte: Latent Diffusion Transformer for Video Generation.
LTH14/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
sihyun-yu/REPA
Official Pytorch Implementation of Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
laylalaisy/TOEFL_laylalaisy
A few TOEFL prep tips and materials. Wishing everyone, newcomers and experts alike, an early breakup with the TOEFL boss monster (o゜▽゜)o☆
pengsongyou/openscene
[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies
JusticeFighterDance/JusticeFighter110
Disclosure of evidence on the malicious cluster attack incident involving 田柯宇 (Tian Keyu)
AIGText/Glyph-ByT5
[ECCV2024] Official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering"
geopavlakos/hamer
HaMeR: Reconstructing Hands in 3D with Transformers
mit-han-lab/hart
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
DL3DV-10K/Dataset
News: the 10k dataset is ready for download.
autonomousvision/LaRa
[ECCV 2024] Efficient Large-Baseline Radiance Fields, a feed-forward 2DGS model
kuleshov-group/mdlm
Simplified Masked Diffusion Language Model
feizc/DiT-MoE
Scaling Diffusion Transformers with Mixture of Experts
lxa9867/ImageFolder
XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation
OscarXZQ/weight-selection
ashawkey/objaverse_filter
Naive filter for Objaverse
MonoFormer/MonoFormer
The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression"
DAMO-NLP-SG/DiGIT
[NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Yuqifan1117/CaCao
This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World" (Accepted by ICCV 2023)
Share14/ShareGemini
zfu006/TAP
Official PyTorch implementation of "Temporal As a Plugin: Unsupervised Video Denoising with Pre-Trained Image Denoisers" in ECCV 2024.
jiachenlei/maskdm
Lirui-Zhao/ELF
Official Pytorch Implementation of Boosting the Cross-Architecture Generalization of Dataset Distillation through an Empirical Study
daniel-gallo/naraim
fabienfrfr/PixelBytes
Catching Insights in Unified Multimodal Sequences