Pinned Repositories
Diffusion-RWKV
Scaling RWKV-Like Architectures for Diffusion Models
Dimba
Transformer-Mamba Diffusion Models
DiS
Scalable Diffusion Models with State Space Backbone
DiT-MoE
Scaling Diffusion Transformers with Mixture of Experts
FluxMusic
Text-to-Music Generation with Rectified Flow Transformers
Gradient-Free-Textual-Inversion
Gradient-Free Textual Inversion for Personalized Text-to-Image Generation
IEA
Image Editing Anything
MLE-LLaMA
Multi-language Enhanced LLaMA
Video-Stable-Diffusion
Generate consistent videos with stable diffusion models
Visual-LLaMA
Open LLaMA Eyes to See the World
feizc's Repositories
feizc/FluxMusic
Text-to-Music Generation with Rectified Flow Transformers
feizc/MLE-LLaMA
Multi-language Enhanced LLaMA
feizc/DiT-MoE
Scaling Diffusion Transformers with Mixture of Experts
feizc/Visual-LLaMA
Open LLaMA Eyes to See the World
feizc/DiS
Scalable Diffusion Models with State Space Backbone
feizc/IEA
Image Editing Anything
feizc/Diffusion-RWKV
Scaling RWKV-Like Architectures for Diffusion Models
feizc/Dimba
Transformer-Mamba Diffusion Models
feizc/Video-Stable-Diffusion
Generate consistent videos with stable diffusion models
feizc/Gradient-Free-Textual-Inversion
Gradient-Free Textual Inversion for Personalized Text-to-Image Generation
feizc/Vespa
Video Diffusion State Space Models
feizc/Visual-ChatGLM
Open ChatGLM Eyes to See the World
feizc/AIO
All In One: General Multimodal Large Language Model
feizc/Matrix-Analysis-and-Application
References and coding homework for the Matrix Analysis and Application course at UCAS
feizc/Cleaned-Webvid
Strategy for producing a cleaned WebVid-10M dataset
feizc/MaskGMT
Masked generative music transformer
feizc/arXiv-MM
Multimodal dataset for arXiv
feizc/Visual-MOSS
Makes the MOSS model understand visual information
feizc/AAT
Attention-Aligned Transformer for Image Captioning
feizc/DSC
Descriptive synthetic captions in DALL-E 3
feizc/feizc
feizc/LQMA
Language Quantized Masked AutoEncoders
feizc/Union
Unifying Language-Image Pre-training via Single-Tower Transformer
feizc/CLKA
Cross Lingual Knowledge Alignment for Stable Diffusion Models
feizc/DiT
Efficient Vision Transformers with Dynamic Token Routing
feizc/MoE-MLLM
Mixture-of-Experts for Multimodal Large Language Models
feizc/resume
A Jekyll-based resume
feizc/ViD
Text-to-Image Diffusion Models as Refined Visual Learners
feizc/ISFT
One Sample is All You Need: Distilling Supervised Fine-tuning by Extrapolating from a Single Image-Text Pair
feizc/LLM-benchmark
Real LLM FLOPS on various training frameworks