zhxfl's Stars
google/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
pjreddie/darknet
Convolutional Neural Networks
microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
mozilla/TTS
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
alibaba/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
gperftools/gperftools
Main gperftools repository
wang-xinyu/tensorrtx
Implementation of popular deep learning networks with TensorRT network definition API
flashlight/wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
halide/Halide
a language for fast, portable data-parallel computation
NVIDIA/thrust
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
MegEngine/MegEngine
MegEngine is a fast, scalable, easy-to-use deep learning framework with support for automatic differentiation
NVIDIA-AI-IOT/torch2trt
An easy to use PyTorch to TensorRT converter
mindspore-ai/mindspore
MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
keithito/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
herumi/xbyak
A JIT assembler for x86/x64 architectures supporting MMX, SSE (1-4), AVX (1-2, 512), FPU, APX, and AVX10.2
NVIDIA/cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
mapillary/inplace_abn
In-Place Activated BatchNorm for Memory-Optimized Training of DNNs
NervanaSystems/maxas
Assembler for NVIDIA Maxwell architecture
eddieantonio/imgcat
It's like cat, but for images.
onnx/onnx-mlir
Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
zerollzeng/tiny-tensorrt
Deploy your model with TensorRT quickly.
NVIDIA/nv-wavenet
Reference implementation of real-time autoregressive wavenet inference
DavidDiazGuerra/gpuRIR
Python library for Room Impulse Response (RIR) simulation with GPU acceleration
NVIDIA/cnmem
A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory
daadaada/turingas
Assembler for NVIDIA Volta and Turing GPUs
PaddlePaddle/CINN
Compiler Infrastructure for Neural Networks
dmlc/nnvm-fusion
Kernel Fusion and Runtime Compilation Based on NNVM
ap-hynninen/cutt
CUDA Tensor Transpose (cuTT) library
XiuYuLi/deepcore_source_code
Subpart source code of deepcore v0.7
jeng1220/cuGemmProf
A simple tool to profile performance of multiple combinations of GEMM of cuBLAS