hgl71964

PhD student in Machine Learning Systems

University of Cambridge, Cambridge

Pinned Repositories

CGRA-Mapper
Language: C
cmu-15445-databases
Language: C++
containers
Language: Shell
cuasmrl
Language: C++
CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully ：）
Language: Python
cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++
dagbo
Bayesian optimisation with semi-parametric DAG models
Language: Python
dot_config
Dotfiles for configuration
Language: Vim Script
fast-route
Language: Python
rmcts
Language: Rust

hgl71964's Repositories

hgl71964/rmcts
Language: Rust
hgl71964/tvm-benchmark
Language: Python
hgl71964/unity-tvm
Language: Python
hgl71964/CGRA-Mapper
Language: C
hgl71964/cmu-15445-databases
Language: C++
hgl71964/containers
Language: Shell
hgl71964/cuasmrl
Language: C++
hgl71964/CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully ：）
Language: Python
hgl71964/cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++
hgl71964/dagbo
Bayesian optimisation with semi-parametric DAG models
Language: Python
hgl71964/dot_config
Dotfiles for configuration
Language: Vim Script
hgl71964/fast-route
Language: Python
hgl71964/gpu-arch-microbenchmark
Dissecting NVIDIA GPU Architecture
hgl71964/huggingnft
Generate NFTs or train a new model in just a few clicks! Train as much as you can; others will resume from your checkpoint!
Language: Jupyter Notebook
hgl71964/minitorch
A minimal torch implementation featuring PyTorch-style automatic differentiation
Language: Python
hgl71964/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language: Python
hgl71964/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Language: Python
hgl71964/SIP
Language: C++
hgl71964/task-graph
Language: C++
hgl71964/triton_fork
Development repository for the Triton language and compiler
Language: C++
hgl71964/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python