LeiWang1999

Practice makes perfect.

Institute of Computing Technology, UCAS
Peking

Pinned Repositories

tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
Language: Python 11.3k 381 3.3k 3.4k
AICS-Course
Exercise solutions, lab solutions, and course notes for "AI Computing Systems" (《智能计算系统》)
Language: C++ 178 2 327
AutoGPTQ.tvm
GPTQ inference TVM kernel
Language: Cuda 34 3 21
DigitalAlarmClock
njtech digital design: an FPGA digital alarm system with the Nexys A7 100T
Language: Verilog 34 1 213
FPGA
Helps newcomers get started with FPGAs and shares excellent FPGA-related articles and projects
3.5k 58 4624
tvm_gpu_gemm
play gemm with tvm
Language: Cuda 78 4 19
VehicleFlowDetection
Implementation of vehicle flow statistics based on TensorFlow and YOLOv3 with a PyQt5 GUI.
Language: Python 18 3 35
ZYNQ-NVDLA
NVDLA (an open-source DL accelerator framework) implementation on FPGA.
Language: Verilog 269 8 2856
BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
Language: Python 162 11 1415
nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
Language: C++ 933 44 204155

LeiWang1999's Repositories

LeiWang1999/ZYNQ-NVDLA
NVDLA (an open-source DL accelerator framework) implementation on FPGA.
Language: Verilog 269 8 2856
LeiWang1999/tvm_gpu_gemm
play gemm with tvm
Language: Cuda 78 4 19
LeiWang1999/AutoGPTQ.tvm
GPTQ inference TVM kernel
Language: Cuda 34 3 21
LeiWang1999/VehicleFlowDetection
Implementation of vehicle flow statistics based on TensorFlow and YOLOv3 with a PyQt5 GUI.
Language: Python 18 3 35
LeiWang1999/leiblog.wang
My New Blog Powered by HEXO http://leiblog.wang
Language: HTML 5 2 0 2
LeiWang1999/rocblas-benchmark
Language: C++ 5 2 1
LeiWang1999/BitBLAS
Language: Python 4
LeiWang1999/memfusion_artifact
Language: Python 4
LeiWang1999/cv
resume.
Language: TeX 3 2 0
LeiWang1999/mlc-benchmark
Language: Python 3
LeiWang1999/tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
Language: Python 3 1 0 1
LeiWang1999/cutlass
Language: C++ 2 2 0
LeiWang1999/Ladder
@DataStructures_Cbased I'm Coming！
Language: Python 2 1 0
LeiWang1999/Roller
Build and train AlexNet with PyTorch, run prediction with both TVM and PyTorch, and compare their performance
Language: Python 2 2 0
LeiWang1999/_cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++ 1 1 0
LeiWang1999/MSBitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
Language: Python 1
LeiWang1999/nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
Language: C++ 1 1 0 1
LeiWang1999/vLLM
Language: Python 1 1 0
LeiWang1999/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Language: Python 1 0
LeiWang1999/AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Language: Python 1 0
LeiWang1999/AutoGPTQ_nf
Language: Python 1 0
LeiWang1999/gptq_faster
Faster 3bit CUDA Kernel for gptq.
Language: Python 1 0
LeiWang1999/LeiWang1999
1 0 1
LeiWang1999/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
Language: Python 1 0
LeiWang1999/nmsparse
Language: HTML 1 0 1
LeiWang1999/nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Language: Python 1 0
LeiWang1999/ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Language: Python 1 0
LeiWang1999/relax
Language: Python 1 0
LeiWang1999/vllm-bitblas
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
LeiWang1999/Welder_artifacts
OSDI 2023 Welder artifacts
Language: Python 1 0