Pinned Repositories
turingas
Assembler for NVIDIA Volta and Turing GPUs
cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
cutlass
CUDA Templates for Linear Algebra Subroutines
gpu-arch-microbenchmark
Dissecting NVIDIA GPU Architecture
mlx
MLX: An array framework for Apple silicon
snippets
TNN
TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop, and server. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and also draws on the good extensibility and high performance of existing open-source efforts. TNN has been deployed in multiple Tencent apps, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome: collaborate with us and help make TNN a better framework.
sjfeng1999's Repositories
sjfeng1999/gpu-arch-microbenchmark
Dissecting NVIDIA GPU Architecture
sjfeng1999/snippets
sjfeng1999/cutlass
CUDA Templates for Linear Algebra Subroutines
sjfeng1999/mlx
MLX: An array framework for Apple silicon