Wenzha0Wu's Stars
numpy/numpy
The fundamental package for scientific computing with Python.
tinygrad/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
karpathy/micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Jittor/jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
onnx/tensorflow-onnx
Convert TensorFlow, Keras, TensorFlow.js and TFLite models to ONNX
facebookresearch/TensorComprehensions
A domain-specific language to express machine learning workloads.
gpgpu-sim/gpgpu-sim_distribution
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as well as a performance visualization tool, AerialVision, and an integrated energy model, GPUWattch.
OAID/AutoKernel
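AutoKernel is a simple, easy-to-use, low-barrier automatic operator optimization tool that improves the deployment efficiency of deep learning algorithms.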
hidet-org/hidet
An open-source efficient deep learning framework/compiler, written in Python.
pybind/cmake_example
Example pybind11 module built with a CMake-based build system
spcl/dace
DaCe - Data Centric Parallel Programming
KEKE046/mlir-tutorial
Hands-On Practical MLIR Tutorial
Xilinx/mlir-aie
An MLIR-based toolchain for AMD AI Engine-enabled devices.
xdslproject/xdsl
A Python Compiler Design Toolkit
UIUC-ChenLab/scalehls
A scalable High-Level Synthesis framework on MLIR
dmlc/HalideIR
Symbolic Expression and Statement Module for new DSLs
microsoft/triton-shared
Shared Middle-Layer for Triton Compilation
maestro-project/maestro
An analytical cost model evaluating DNN mappings (dataflows and tiling).
cornell-zhang/allo
Allo: A Programming Model for Composable Accelerator Design
openppl-public/ppl.llm.kernel.cuda
Meinersbur/ppcg
Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git)
Lewuathe/mlir-hello
MLIR Sample dialect
pku-liang/AMOS
Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators
inducer/islpy
Python wrapper for isl, an integer set library
Meinersbur/isl
Integer Set Library (source repository: http://repo.or.cz/w/isl.git)
Par4All/par4all
Par4All is an automatic parallelizing and optimizing compiler (workbench) for C and Fortran sequential programs
ftynse/clint
Chunky Loop Interaction
ulysseB/telamon
A framework to find good combinations of optimizations for computational kernels on GPUs.
yaoyaoding/hidet-artifacts
This repository is the artifact of paper "Hidet: Task Mapping Programming Paradigm for Deep Learning Tensor Programs".