Pinned Repositories
ColossalAI
Making large AI models cheaper, faster and more accessible
Benchmark
lwpf2
Sunway CPE performance tools
PUNASfilter
Source code for the parallel ungapped-alignment-featured seed verification (PUNAS) algorithm for next-generation sequence alignment.
SPECTR
SPECTR is designed to improve the throughput of DNA error correction for Illumina reads. Our design is based on the memory-efficient BLESS algorithm but is optimized for AVX-512-based CPUs, Xeon Phi many-core processors (both KNC and KNL), and heterogeneous compute clusters.
SWMapper
SWMapper is a read mapper for the Sunway architecture
Xu-Kai's Repositories
Xu-Kai/lwpf2
Sunway CPE performance tools
Xu-Kai/SPECTR
SPECTR is designed to improve the throughput of DNA error correction for Illumina reads. Our design is based on the memory-efficient BLESS algorithm but is optimized for AVX-512-based CPUs, Xeon Phi many-core processors (both KNC and KNL), and heterogeneous compute clusters.
Xu-Kai/SWMapper
SWMapper is a read mapper for the Sunway architecture
Xu-Kai/Benchmark
Xu-Kai/PUNASfilter
Source code for the parallel ungapped-alignment-featured seed verification (PUNAS) algorithm for next-generation sequence alignment.
Xu-Kai/ColossalAI
Making large AI models cheaper, faster and more accessible
Xu-Kai/ColossalChat
Xu-Kai/dlbook_exercises
Exercises for the Deep Learning textbook at www.deeplearningbook.org
Xu-Kai/hexo-blog
Source code for my blog
Xu-Kai/cuda_hgemm
Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.
Xu-Kai/gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
Xu-Kai/GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
Xu-Kai/KGen
Fortran Kernel Generator
Xu-Kai/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Xu-Kai/notes
Xu-Kai/notes-1
Notes on problems encountered in day-to-day work, along with basic background knowledge
Xu-Kai/public_assets
Storing publicly available assets such as images, animations and texts
Xu-Kai/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.
Xu-Kai/s-aligner-paper
Xu-Kai/SDU_thesis_template_for_postgraduate
Shandong University thesis template for master's and doctoral graduate students
Xu-Kai/SW-GRIST
Xu-Kai/TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe
Xu-Kai/XBLESS
Xu-Kai/Xu-Kai.github.io