Pinned Repositories
accfft
A Massively Parallel FFT Library for CPU/GPU
AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
AutoTest
awesome-machine-learning-cn
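Chinese edition of the comprehensive machine learning resource list, covering frameworks, libraries, and software in the machine learning field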
baidu-allreduce
EZLippi.github.io
This is the source code of my personal website; forks are welcome.
fastmoe
A fast MoE impl for PyTorch
how-to-optimize-gemm
taco
The Tensor Algebra Compiler (taco) computes tensor expressions on sparse and dense tensors
limin2021's Repositories
limin2021/DeployUseTensorRT
Deploy awesome computer vision models using TensorRT
limin2021/blislab
BLISlab: A Sandbox for Optimizing GEMM
limin2021/maxas
Assembler for NVIDIA Maxwell architecture
limin2021/limin2015.github.io
My new research life, starting from this blog.
limin2021/OpenCLCode
limin2021/Interview-Notebook
:books: Fundamental knowledge required for technical interviews, continuously updated
limin2021/cakechat
CakeChat: Emotional Generative Dialog System
limin2021/blocksparse
Efficient GPU kernels for block-sparse matrix multiplication and convolution
limin2021/stanford-tensorflow-tutorials
This repository contains code examples for Stanford's course: TensorFlow for Deep Learning Research.
limin2021/Halide
a language for image processing and computational photography
limin2021/FoolNLTK
A Chinese Natural Language Toolkit
limin2021/xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Flink and DataFlow
limin2021/taco
The Tensor Algebra Compiler (taco) computes tensor expressions on sparse and dense tensors
limin2021/cutlass
CUDA Templates for Linear Algebra Subroutines
limin2021/loopy
A code generator for array-based code on CPUs and GPUs
limin2021/caffe2
Caffe2 is a lightweight, modular, and scalable deep learning framework.
limin2021/cosmos
C++11 base library
limin2021/TensorRT_Tutorial
limin2021/coding-exercises
My implementation of useful data structures, algorithms, as well as my solutions to programming puzzles.
limin2021/state-of-the-art-result-for-machine-learning-problems
This repository provides state-of-the-art (SoTA) results for all machine learning problems. We do our best to keep this repository up to date. If you find that a problem's SoTA result is out of date or missing, please raise it as an issue or submit the Google form (with this information: research paper name, dataset, metric, source code and year). We will fix it immediately.
limin2021/tensorflow
Computation using data flow graphs for scalable machine learning
limin2021/jetson-inference
Guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and Jetson TX1/TX2.
limin2021/nnvm
Bring deep learning to bare metal
limin2021/VersaPipe
A framework for pipelined computing on GPU
limin2021/CLCudaAPI
A portable high-level API with CUDA or OpenCL back-end
limin2021/dll
Deep Learning Library (DLL) for C++
limin2021/mobile-deep-learning
This research aims to make it simple to deploy CNNs (Convolutional Neural Networks) on mobile devices, with low complexity and high speed.
limin2021/CMake
Mirror of CMake upstream repository
limin2021/sru
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
limin2021/Incremental-Network-Quantization
Caffe implementation of Incremental Network Quantization