whlzy's Stars
karpathy/LLM101n
LLM101n: Let's build a Storyteller
LTH14/rcg
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
FoundationVision/OmniTokenizer
OmniTokenizer: one model and one weight for image-video joint tokenization.
deepseek-ai/DeepSeek-Coder-V2
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
feifeibear/long-context-attention
Sequence Parallel Attention for Long Context LLM Model Training and Inference
zhaoyue-zephyrus/bsq-vit
[BSQ-ViT] Image and Video Tokenization with Binary Spherical Quantization
lucidrains/titok-pytorch
Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"
Q-Future/CMC-Bench
[LMM + codec] A new paradigm of visual signal compression!
hsiehjackson/RULER
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
mit-han-lab/distrifuser
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
OpenGVLab/InternVideo2
leptonai/search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
idootop/mi-gpt
🏠 Connect your XiaoAI speaker to ChatGPT and Doubao, turning it into your own personal voice assistant.
HaoyiZhu/PointCloudMatters
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning
hjq133/piccolo-embedding
Code for the piccolo embedding model from SenseTime
gojasper/flash-diffusion
Official implementation of ⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation
QwenLM/Qwen2
Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.
Vchitect/Latte
Latte: Latent Diffusion Transformer for Video Generation.
Doraemonzzz/vector-quantize
tianweiy/DMD2
lllyasviel/Omost
Your image is almost there!
jzhang38/EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
discus0434/aesthetic-predictor-v2-5
SigLIP-based Aesthetic Score Predictor
christophschuhmann/improved-aesthetic-predictor
CLIP+MLP Aesthetic Score Predictor
FoundationVision/VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
sony/ctm
borisdayma/dalle-mini
DALL·E Mini - Generate images from a text prompt
HigherOrderCO/Bend
A massively parallel, high-level programming language
allenai/unified-io-2