Pinned Repositories
18-447
big
CMU-18-447
cmulabs
Lab assignments in CMU 18-447 Computer Architecture course
colossal-test
ColossalAI
Making big AI models cheaper, easier, and scalable
Cubework-hzy1
MIPS-Sim
o-vit-FSDP
vit-colossal
ziyuhuang123's Repositories
ziyuhuang123/o-vit-FSDP
ziyuhuang123/vit-colossal
ziyuhuang123/big
ziyuhuang123/CMU-18-447
ziyuhuang123/colossal-test
ziyuhuang123/ColossalAI
Making big AI models cheaper, easier, and scalable
ziyuhuang123/Cubework-hzy1
ziyuhuang123/cupq
A CUDA implementation of a priority queue
ziyuhuang123/cusync
ziyuhuang123/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
ziyuhuang123/flash-attention
Fast and memory-efficient exact attention
ziyuhuang123/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
ziyuhuang123/for_pre_commit
ziyuhuang123/FSDP
ziyuhuang123/gpt2-1
ziyuhuang123/lsolver
🔢 lsolver is developed to optimize the performance of the sparse matrix triangular solve.
ziyuhuang123/o-vit-DDP
ziyuhuang123/resnet-DDP
ziyuhuang123/s-blas
This package implements four sparse linear algebra kernels, Sparse-Matrix-Vector-Multiplication (SpMV), Sparse-Triangular-Solve (SpTRSV), Sparse-Matrix-Transposition (SpTrans), and Sparse-Matrix-Matrix-Multiplication (SpMM), for single-node multi-GPU (scale-up) platforms such as NVIDIA DGX-1 and DGX-2.
ziyuhuang123/sparse
Optimized sparse triangular solve.
ziyuhuang123/test
ziyuhuang123/test_action
ziyuhuang123/test_action_new
ziyuhuang123/test_github_action
ziyuhuang123/test_huoshan
ziyuhuang123/vit
ziyuhuang123/vit-DDP
ziyuhuang123/vit-FSDP
ziyuhuang123/vit-test
ziyuhuang123/vit_colossal