Pinned Repositories
flash-attention
Fast and memory-efficient exact attention
flux
Official inference repo for FLUX.1 models
kernels
minRF-ONNX
Minimal implementation of scalable rectified flow transformers, based on SD3's approach
nexfort
OneDiff compiler infrastructure using TorchInductor
quant_dit_models
quanto
test_attn
Testing and benchmarking different attention implementations and backends (a minimal backend-comparison sketch follows this list)
triton-kernels
Triton kernels for Flux
xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU clusters
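
Two of the pinned repos, flash-attention and test_attn, revolve around the same question: which attention implementation is fastest for a given shape. As a minimal sketch of such a comparison (assuming PyTorch 2.3+ on a CUDA GPU; this is not the repos' actual harness), torch.nn.attention.sdpa_kernel can pin scaled_dot_product_attention to one backend at a time:

```python
# Hypothetical sketch, not test_attn's actual code: time PyTorch's
# scaled_dot_product_attention under each available backend.
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

def bench_backend(backend, q, k, v, iters=50):
    with sdpa_kernel(backend):            # pin SDPA to one backend
        for _ in range(5):                # warmup
            F.scaled_dot_product_attention(q, k, v)
        torch.cuda.synchronize()
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(iters):
            F.scaled_dot_product_attention(q, k, v)
        end.record()
        torch.cuda.synchronize()
    return start.elapsed_time(end) / iters  # mean ms per call

if __name__ == "__main__":
    # (batch, heads, seq_len, head_dim), fp16 on CUDA as flash attention requires
    shape = (4, 16, 4096, 64)
    q, k, v = (torch.randn(shape, device="cuda", dtype=torch.float16)
               for _ in range(3))
    for backend in (SDPBackend.FLASH_ATTENTION,
                    SDPBackend.EFFICIENT_ATTENTION,
                    SDPBackend.MATH):
        print(f"{backend.name:20s} {bench_backend(backend, q, k, v):.3f} ms")
```

The MATH backend materializes the full attention matrix, so at long sequence lengths the flash and memory-efficient backends should win by a wide margin, which is exactly the claim in the flash-attention tagline above.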
AI Compiler Study's Repositories
ai-compiler-study/triton-kernels
Triton kernels for Flux (an illustrative kernel sketch follows this list)
ai-compiler-study/flux
Official inference repo for FLUX.1 models
ai-compiler-study/minRF-ONNX
Minimal implementation of scalable rectified flow transformers, based on SD3's approach (a training-step sketch follows this list)
ai-compiler-study/nexfort
OneDiff compiler infrastructure using TorchInductor
ai-compiler-study/quanto
ai-compiler-study/cutlass-kernels
ai-compiler-study/flash-attention
Fast and memory-efficient exact attention
ai-compiler-study/kernels
ai-compiler-study/quant_dit_models
ai-compiler-study/test_attn
Testing and benchmarking different attention implementations and backends
ai-compiler-study/unet.cu
UNet diffusion model in pure CUDA
ai-compiler-study/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU clusters
ai-compiler-study/flux-tinygrad-opt
Optimize Flux on tinygrad
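
minRF-ONNX, flux, and xDiT all sit on the rectified-flow formulation that SD3 and FLUX.1 use: instead of a DDPM noise schedule, the model learns a velocity field along the straight line between a noise sample and a data sample. Below is a toy training step, as a sketch only; the network, shapes, and the direction convention for t are illustrative, not minRF-ONNX's actual code:

```python
# Minimal rectified-flow training step; illustrative only, with a toy
# MLP standing in for the transformer used by SD3/FLUX-style models.
import torch
import torch.nn as nn

class ToyVelocityNet(nn.Module):
    """Predicts velocity v(x_t, t); a hypothetical stand-in for a DiT."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(),
                                 nn.Linear(256, dim))

    def forward(self, x_t, t):
        return self.net(torch.cat([x_t, t[:, None]], dim=-1))

def rectified_flow_loss(model, x1):
    """x1: a batch of data samples; x0: Gaussian noise."""
    x0 = torch.randn_like(x1)
    t = torch.rand(x1.shape[0], device=x1.device)      # t ~ U[0, 1]
    x_t = (1 - t[:, None]) * x0 + t[:, None] * x1      # straight-line interpolant
    v_target = x1 - x0                                 # constant velocity along the line
    v_pred = model(x_t, t)
    return torch.mean((v_pred - v_target) ** 2)

model = ToyVelocityNet()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
x1 = torch.randn(32, 64)   # stand-in "data" batch
opt.zero_grad()
loss = rectified_flow_loss(model, x1)
loss.backward()
opt.step()
```

Sampling then just integrates dx/dt = v(x, t) from noise to data, for example with a handful of Euler steps, which is what makes rectified-flow models cheap to sample compared with DDPM-style schedules.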
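
triton-kernels is described only as "Triton kernels for Flux". Flux applies RMSNorm to queries and keys, so a row-wise RMSNorm is a plausible example of the kind of kernel such a repo contains; the sketch below is illustrative, not the repo's actual code:

```python
# Hypothetical row-wise RMSNorm in Triton (one program instance per row),
# the normalization Flux-style models apply to Q and K.
import torch
import triton
import triton.language as tl

@triton.jit
def rmsnorm_kernel(x_ptr, w_ptr, out_ptr, n_cols, eps, BLOCK: tl.constexpr):
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK)
    mask = cols < n_cols
    # Load one row, accumulating in float32 for numerical stability.
    x = tl.load(x_ptr + row * n_cols + cols, mask=mask, other=0.0).to(tl.float32)
    rms = tl.sqrt(tl.sum(x * x, axis=0) / n_cols + eps)
    w = tl.load(w_ptr + cols, mask=mask, other=0.0).to(tl.float32)
    y = (x / rms) * w
    # Cast back to the output tensor's own dtype on store.
    tl.store(out_ptr + row * n_cols + cols,
             y.to(out_ptr.dtype.element_ty), mask=mask)

def rmsnorm(x: torch.Tensor, w: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # x: (n_rows, n_cols) contiguous CUDA tensor; w: (n_cols,) scale weights
    out = torch.empty_like(x)
    BLOCK = triton.next_power_of_2(x.shape[-1])
    rmsnorm_kernel[(x.shape[0],)](x, w, out, x.shape[-1], eps, BLOCK=BLOCK)
    return out
```

Accumulating the sum of squares in float32 while storing in the tensor's own dtype is the usual trick for keeping fp16/bf16 normalization numerically stable without paying for a full-precision tensor.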