choi95's Stars
corelab-src/occamy
Edgecortix-Inc/mera
A Heterogeneous Platform Deep Learning Compiler Framework from EdgeCortix
pku-liang/FlexTensor
Automatic Schedule Exploration and Optimization Framework for Tensor Computations
google-research-datasets/tpu_graphs
xiezhq-hermann/graphiler
Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into efficient execution plans.
tensor-compiler/taco
The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs
manya-bansal/mosaic
srush/Triton-Puzzles
Puzzles for learning Triton
jax-ml/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
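The "composable transformations" in that description refer to JAX's function transforms such as `jax.grad`, `jax.jit`, and `jax.vmap`, which can be stacked on ordinary NumPy-style Python functions. A minimal sketch (assumes the `jax` package is installed; the function `f` is an illustrative example, not from the repo):

```python
import jax
import jax.numpy as jnp

# A plain NumPy-style function: f(x) = sum(x^2)
def f(x):
    return jnp.sum(x ** 2)

grad_f = jax.grad(f)         # differentiate: df/dx = 2x
fast_grad = jax.jit(grad_f)  # JIT-compile for CPU/GPU/TPU

print(fast_grad(jnp.arange(3.0)))  # gradient at [0, 1, 2] -> [0. 2. 4.]
```

Because the transforms compose, `jax.vmap(jax.grad(f))` or `jax.jit(jax.vmap(...))` work the same way, which is the "composable" part of the pitch.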
google/xls
XLS: Accelerated HW Synthesis
fivosts/BenchPress
:orange_book: Source code for "BenchPress: A Deep Active Benchmark Generator", PACT 2022
InfiniTensor/InfiniTensor
hisrg/SNPE
Snapdragon Neural Processing Engine (SNPE) SDK. The Snapdragon Neural Processing Engine (SNPE) is a Qualcomm Snapdragon software-accelerated runtime for the execution of deep neural networks. With SNPE, users can:
- Execute an arbitrarily deep neural network
- Execute the network on the Snapdragon CPU, the Adreno GPU, or the Hexagon DSP
- Debug network execution on x86 Ubuntu Linux
- Convert Caffe, Caffe2, ONNX, and TensorFlow models to a SNPE Deep Learning Container (DLC) file
- Quantize DLC files to 8-bit fixed point for running on the Hexagon DSP
- Debug and analyze network performance with SNPE tools
- Integrate a network into applications and other code via C++ or Java
tenstorrent/tt-budabackend
Buda Compiler Backend for Tenstorrent devices
tenstorrent/tt-buda
Tenstorrent TT-BUDA Repository
triton-inference-server/tensorrt_backend
The Triton backend for TensorRT.
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
ARM-software/CMSIS-NN
CMSIS-NN Library
ROCm/MIOpen
AMD's Machine Intelligence Library
hsharma35/dnnweaver2
Open Source Specialized Computing Stack for Accelerating Deep Neural Networks.
NervanaSystems/ngraph
nGraph has moved to OpenVINO
microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
apache/tvm-vta
Open, Modular, Deep Learning Accelerator
Tiramisu-Compiler/tiramisu
A polyhedral compiler for expressing fast and portable data parallel algorithms
ONNC/onnc
Open Neural Network Compiler
kendryte/nncase
Open deep learning compiler stack for Kendryte AI accelerators ✨
danielholanda/LeFlow
Enabling Flexible FPGA High-Level Synthesis of TensorFlow Deep Neural Networks
EnzymeAD/Enzyme
High-performance automatic differentiation of LLVM and MLIR.
tenstorrent/tt-mlir
Tenstorrent MLIR compiler
uwsampl/SparseTIR
SparseTIR: Sparse Tensor Compiler for Deep Learning