gty111
Ph.D. student at Sun Yat-sen University, former intern @Tencent. Simulators, GPU, architecture, HPC
Sun Yat-sen University, Guangzhou
Pinned Repositories
SYSU-ARCH
SYSU-ARCH is a lab that focuses on using and extending simulators.
accel-sim-framework
This is the top-level repository for the Accel-Sim framework.
BCI
ConvNN
A simple CNN training framework that supports CPU and GPU (cuDNN)
DistVAE
A parallel VAE that avoids OOM in high-resolution image generation
GEMM_MMA
Optimize GEMM with Tensor Cores step by step
GEMM_WMMA
GEMM with WMMA (Tensor Cores)
PTX-EMU
PTX-EMU is a simple emulator for CUDA programs.
SimpleUseGpgpuSim
GPGPU-Sim usage guide
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
gty111's Repositories
gty111/PTX-EMU
PTX-EMU is a simple emulator for CUDA programs.
gty111/GEMM_MMA
Optimize GEMM with Tensor Cores step by step
gty111/SimpleUseGpgpuSim
GPGPU-Sim usage guide
gty111/GEMM_WMMA
GEMM with WMMA (Tensor Cores)
gty111/ConvNN
A simple CNN training framework that supports CPU and GPU (cuDNN)
gty111/BCI
gty111/accel-sim-framework
This is the top-level repository for the Accel-Sim framework.
gty111/DuiPai
gty111/eChat
gty111/gpgpu-sim_distribution
GPGPU-Sim provides a detailed simulation model of a contemporary GPU running CUDA and/or OpenCL workloads and now includes an integrated (and validated) energy model, GPUWattch.
gty111/gty111
gty111/gty111.github.io
gty111/human-eval-infilling
Code for the paper "Efficient Training of Language Models to Fill in the Middle"
gty111/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
gty111/Mouse-Controler
gty111/resume
An elegant \LaTeX\ résumé template. Mainland China mirror: https://gods.coding.net/p/resume/git
gty111/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
gty111/vllm-pub
A high-throughput and memory-efficient inference and serving engine for LLMs
gty111/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters