Pinned Repositories
CUDA
Basic CUDA code for Visual Studio 2012 (vecAdd, image convolution, histogram, reduction, scan, etc.)
cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
CUDALibrarySamples
CUDA Library Samples
ethminer
Ethereum miner with OpenCL, CUDA and stratum support
gpu-sum-reduction
CUDA implementation of the fundamental sum-reduce operation. Aims to be as optimized as is reasonable.
NCCL
Examples of how to call collective operation functions in multi-GPU environments, including a simple example that uses the broadcast, reduce, allGather, reduceScatter and sendRecv operations.
rccl
ROCm Communication Collectives Library (RCCL)
rocBLAS
Next-generation BLAS implementation for the ROCm platform
rocSPARSE
Next-generation SPARSE implementation for the ROCm platform
TNN
TNN: a uniform deep learning inference framework for mobile, desktop and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and also draws on the good extensibility and high performance of existing open source efforts. TNN has been deployed in multiple apps from Tencent, such as Mobile QQ, Weishi, Pitu, etc. Contributions are welcome: collaborate with us to make TNN a better framework.
mtxuhao's Repositories
mtxuhao/CUDA
Basic CUDA code for Visual Studio 2012 (vecAdd, image convolution, histogram, reduction, scan, etc.)
mtxuhao/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
mtxuhao/CUDALibrarySamples
CUDA Library Samples
mtxuhao/ethminer
Ethereum miner with OpenCL, CUDA and stratum support
mtxuhao/gpu-sum-reduction
CUDA implementation of the fundamental sum-reduce operation. Aims to be as optimized as is reasonable.
mtxuhao/NCCL
Examples of how to call collective operation functions in multi-GPU environments, including a simple example that uses the broadcast, reduce, allGather, reduceScatter and sendRecv operations.
mtxuhao/rccl
ROCm Communication Collectives Library (RCCL)
mtxuhao/rocBLAS
Next-generation BLAS implementation for the ROCm platform
mtxuhao/rocSPARSE
Next-generation SPARSE implementation for the ROCm platform
mtxuhao/TNN
TNN: a uniform deep learning inference framework for mobile, desktop and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and also draws on the good extensibility and high performance of existing open source efforts. TNN has been deployed in multiple apps from Tencent, such as Mobile QQ, Weishi, Pitu, etc. Contributions are welcome: collaborate with us to make TNN a better framework.