seth-lu

Pinned Repositories

3D-Machine-Learning
A resource repository for 3D machine learning
0 0 00
cmake-examples
Useful CMake Examples
Language: CMake
0 0 00
collaborative-attention
Code for Multi-Head Attention: Collaborate Instead of Concatenate
Language: Python
0 0 00
convGemm
The convGemm library performs the convolution operation using an implicit im2row or im2col over a GEMM operation with matrices in either the NHWC or NCHW format, respectively.
Language: C
1 0 00
cs344
Introduction to Parallel Programming class code
Language: Cuda
0 0 00
cuda_sgemm
Language: Cuda
0 0 00
DimReduce
Language: JavaScript
0 0 00
HPC-Knowledge-Library
Language: C++
3 2 00
Im2win
Language: C++
13 2 10
Represent-ML-algorithm-by-Tensor-Algebra
Machine learning algorithm, Tensor, ITensor
2 2 40

seth-lu's Repositories

seth-lu/Im2win
Language: C++
13 2 10
seth-lu/HPC-Knowledge-Library
Language: C++
3 2 00
seth-lu/Represent-ML-algorithm-by-Tensor-Algebra
Machine learning algorithm, Tensor, ITensor
2 2 40
seth-lu/convGemm
The convGemm library performs the convolution operation using an implicit im2row or im2col over a GEMM operation with matrices in either the NHWC or NCHW format, respectively.
Language: C
1 0 00
seth-lu/3D-Machine-Learning
A resource repository for 3D machine learning
0 0 00
seth-lu/cmake-examples
Useful CMake Examples
Language: CMake
0 0 00
seth-lu/collaborative-attention
Code for Multi-Head Attention: Collaborate Instead of Concatenate
Language: Python
0 0 00
seth-lu/cs344
Introduction to Parallel Programming class code
Language: Cuda
0 0 00
seth-lu/cuda_sgemm
Language: Cuda
0 0 00
seth-lu/DimReduce
Language: JavaScript
0 0 00
seth-lu/EfficientConvolution
Implementation of an efficient convolution between 3D tensors and 4D tensors.
Language: C++
0 0 00
seth-lu/Fastor
A lightweight high performance tensor algebra framework for modern C++
seth-lu/how-to-optimize-gemm
Language: C
0 0
seth-lu/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
Language: Cuda
seth-lu/implicit_gemm_convolution
Language: C
0 0
seth-lu/ITensor
A C++ library for efficient tensor network calculations
seth-lu/laser
The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
seth-lu/Learn-CUDA-Programming
Learn CUDA Programming, published by Packt
seth-lu/LibtorchTutorials
A code repository for the PyTorch C++ (LibTorch) tutorial.
Language: C++
0 0
seth-lu/libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
seth-lu/ls110082
Config files for my GitHub profile.
seth-lu/mtensor
A C++ Cuda Tensor Lazy Computing Library
Language: C++
0 0
seth-lu/ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Language: C++
0 0
seth-lu/NN-CUDA-Example
Several simple examples for popular neural network toolkits calling custom CUDA operators.
seth-lu/PyTorch-BayesianCNN
Bayesian Convolutional Neural Network with Variational Inference based on Bayes by Backprop in PyTorch.
seth-lu/pytorch-handbook
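pytorch handbook is an open-source book that aims to help people who want to use PyTorch for deep learning development and research get started quickly. All of the PyTorch tutorials it contains have been tested to ensure they run successfully.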
seth-lu/pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers
Language: Python
0 0
seth-lu/splatt
The Surprisingly ParalleL spArse Tensor Toolkit.
Language: C
0 0
seth-lu/visdom
A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.
Language: Python
0 0
seth-lu/zh-google-styleguide
Google Open Source Project Style Guides (Chinese edition)
Language: Makefile
0 0