puckbee's Stars
kelseyhightower/nocode
The best way to write secure and reliable applications. Write nothing; deploy nowhere.
janishar/mit-deep-learning-book-pdf
MIT Deep Learning Book in PDF format (complete and parts) by Ian Goodfellow, Yoshua Bengio and Aaron Courville
Theano/Theano
Theano was a Python library that allowed you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It is being continued as PyTensor: www.github.com/pymc-devs/pytensor
aaron-xichen/pytorch-playground
Base pretrained models and datasets in PyTorch (MNIST, SVHN, CIFAR10, CIFAR100, STL10, AlexNet, VGG16, VGG19, ResNet, Inception, SqueezeNet)
jbush001/NyuziProcessor
GPGPU microprocessor architecture
fengbintu/Neural-Networks-on-Silicon
This was originally a collection of papers on neural network accelerators. Now it's more like my selection of research on deep learning and computer architecture.
flame/how-to-optimize-gemm
Maratyszcza/NNPACK
Acceleration package for neural networks on multi-core CPUs
SVF-tools/SVF
Static Value-Flow Analysis Framework for Source Code
Rock-100/FaceKit
[CVPR 2018] Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
NJU-ProjectN/nemu
NJU EMUlator, a full system x86/mips32/riscv32/riscv64 emulator for teaching
andravin/wincnn
Winograd minimal convolution algorithm generator for convolutional neural networks.
iBreaker/book
A collection of professional books <submissions welcome>
LvNA-system/labeled-RISC-V
cdl-saarland/rv
RV: A Unified Region Vectorizer for LLVM
seung-lab/znn-release
Multi-core CPU implementation of deep learning for 2D and 3D sliding window convolutional networks (ConvNets).
elongbug/llvm-cookbook
llvm-cookbook samples
EBD-CREST/nsparse
Sparse matrix computation library for GPU
KastnerRG/spector
Spector: An OpenCL FPGA Benchmark Suite
tanakamura/instruction-bench
instruction-bench
arbenson/fast-matmul
Fast matrix multiplication
karrenberg/wfv
IMPORTANT NOTICE: This implementation is long outdated. The new libwfv will be released soon. Whole-Function Vectorization is an algorithm that transforms a scalar function in such a way that it computes W executions of the original code in parallel using SIMD instructions (W is the target architecture's SIMD width). This implementation of the algorithm is a language- and platform-independent code transformation that works on low-level intermediate code given by an arbitrary control-flow graph in SSA form (LLVM bitcode).
jszhujun2010/Clang-Basic-Tutorial
Basic Clang library, LibTooling and Plugin
davidebarbieri/spgpu
spGPU library for sparse linear algebra on GPUs
canercandan/linear-algebra
A linear algebra framework in C++ along with a layout abstraction for parallelization paradigms. It provides operators to compute dense and sparse matrices with generically designed scalar, complex, vector and matrix types. At this time, the framework supports the libraries CUDA, CUBLAS, CUSP, CUSPARSE for parallel computing on GPGPU.
eerbil/Code-Selection-For-SpMV-Using-Deep-Learning
Reimplementation of the paper "A Code Selection Mechanism Using Deep Learning" in Python.
kevinzhang334455/Scout
UR research Project
ryanh3nry/BarrettCUDA
BarrettCUDA is a fast(ish) implementation of finite field sparse matrix-vector multiplication (SpMV) for Nvidia GPU devices, written in CUDA C++. BarrettCUDA supports SpMV for matrices expressed in the 'compressed column storage' (CCS) sparse matrix representation over (i) the field of integers modulo an arbitrary multi-precision prime, or (ii) either of the binary fields GF(2^8) or GF(2^16).
AdamHarries/sparseharness
A harness/set of harnesses for executing SpMV-based algorithms from Lift
SumithraSriram/Sparse-Matrices
Implementation of Sparse Matrix Vector Multiplication using various Sparse Matrix storage formats