Pinned Repositories
ark
A GPU-driven system framework for scalable AI applications
cutlass
CUDA Templates for Linear Algebra Subroutines
fairscale
PyTorch extensions for high-performance and large-scale training.
FasterTransformer
Transformer-related optimization, including BERT, GPT
GoldMiner_Verilog_VGA
CS294O
jihaoxin.github.io
Killing-the-Duck-Curve---KAUST-CS244-Project
L-GreCo
An Efficient and General Framework for Layerwise-Adaptive Gradient Compression
llama3
The official Meta Llama 3 GitHub site
msccl
Microsoft Collective Communication Library
JihaoXin's Repositories
JihaoXin/Killing-the-Duck-Curve---KAUST-CS244-Project
JihaoXin/ark
A GPU-driven system framework for scalable AI applications
JihaoXin/cutlass
CUDA Templates for Linear Algebra Subroutines
JihaoXin/fairscale
PyTorch extensions for high-performance and large-scale training.
JihaoXin/FasterTransformer
Transformer-related optimization, including BERT, GPT
JihaoXin/GoldMiner_Verilog_VGA
CS294O
JihaoXin/jihaoxin.github.io
JihaoXin/L-GreCo
An Efficient and General Framework for Layerwise-Adaptive Gradient Compression
JihaoXin/llama3
The official Meta Llama 3 GitHub site
JihaoXin/msccl
Microsoft Collective Communication Library
JihaoXin/nccl
Optimized primitives for collective multi-GPU communication
JihaoXin/nccl_reduction
JihaoXin/Splash-Detection
JihaoXin/terraform-provider-tencentcloud
Terraform tencentcloud provider
JihaoXin/testweb
JihaoXin/Tinker
KAUST CS294V Project
JihaoXin/SVD
JihaoXin/tlrmvm