SPADE: SPArse DEep learning framework
This repository contains a framework for efficient computation on sparse tensors in deep learning networks.
Running SPADE:
CC=icc make
Building TACO:
If you run into this error on the cluster:
c++ unrecognized -std=c++14
do the following:
module load gcc/8.1.0
export CC=$(which gcc)
export CXX=$(which g++)
This makes sure CMake picks up the right C and C++ compilers.
For emitting code for CPU:
./build/bin/taco "a(i) = B(i,j) * c(j)"
./bin/taco "a(i) = B(i,j) * c(j)" -f=B:ds -i=B:/scratch/seth.k/spmv/data/MM/Baumann/Baumann.mtx -g=c:d -o=a:tmp.mtx -verify -print-concrete -time=2 -write-time=
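The expression `a(i) = B(i,j) * c(j)` is a sparse matrix-vector product (SpMV), and the format `B:ds` (dense rows, sparse columns) corresponds to CSR storage. As a rough reference for what the generated kernel computes, here is a minimal pure-Python sketch; the function name `spmv_csr` and the example matrix are illustrative assumptions, not code from this repository or from TACO.

```python
def spmv_csr(row_ptr, col_idx, values, c):
    """Compute a(i) = sum_j B(i,j) * c(j) for a matrix B stored in CSR form.

    row_ptr: len(rows) + 1 offsets into col_idx/values
    col_idx: column index of each stored nonzero
    values:  value of each stored nonzero
    """
    n_rows = len(row_ptr) - 1
    a = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Walk only the stored nonzeros of row i.
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[p] * c[col_idx[p]]
        a[i] = acc
    return a


if __name__ == "__main__":
    # Hypothetical 3x3 example:
    # B = [[1, 0, 2],
    #      [0, 0, 3],
    #      [4, 5, 0]]
    row_ptr = [0, 2, 3, 5]
    col_idx = [0, 2, 2, 0, 1]
    values = [1.0, 2.0, 3.0, 4.0, 5.0]
    c = [1.0, 2.0, 3.0]
    print(spmv_csr(row_ptr, col_idx, values, c))  # [7.0, 9.0, 14.0]
```

A reference like this is only useful for checking results on small inputs; it says nothing about the performance of the generated code.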
Running the pOSKI autotuner:
python external/poski/bench/CodeGen_MBCSRRowMaj_Matmul.py
Testing:
spade/spmv ../data/test.mtx
spade/spmv
<= this runs on the test.csv file, which contains a small sparse matrix
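For trying the test commands on other small inputs, a tiny matrix can be written by hand in Matrix Market coordinate format (the `.mtx` format used above). The sketch below is a minimal example under stated assumptions: the helper names `write_mtx` and `read_mtx` and the file name `tiny.mtx` are hypothetical, and it only covers the `coordinate real general` variant, which may or may not be everything `spade/spmv` expects.

```python
import os
import tempfile


def write_mtx(path, n_rows, n_cols, entries):
    """Write (row, col, value) triples with 0-based indices as a Matrix Market file."""
    with open(path, "w") as f:
        f.write("%%MatrixMarket matrix coordinate real general\n")
        f.write(f"{n_rows} {n_cols} {len(entries)}\n")
        for i, j, v in entries:
            # Matrix Market indices are 1-based.
            f.write(f"{i + 1} {j + 1} {v}\n")


def read_mtx(path):
    """Read a coordinate real general file back into 0-based triples."""
    with open(path) as f:
        # Skip the banner, comment lines, and blank lines.
        lines = [ln for ln in f if ln.strip() and not ln.startswith("%")]
    n_rows, n_cols, nnz = map(int, lines[0].split())
    entries = []
    for ln in lines[1:1 + nnz]:
        i, j, v = ln.split()
        entries.append((int(i) - 1, int(j) - 1, float(v)))
    return n_rows, n_cols, entries


if __name__ == "__main__":
    entries = [(0, 0, 1.0), (0, 2, 2.0), (1, 2, 3.0), (2, 0, 4.0), (2, 1, 5.0)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tiny.mtx")  # hypothetical file name
        write_mtx(path, 3, 3, entries)
        print(read_mtx(path))
```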
To install perf, install the linux-tools package. You then have to change some kernel settings so that perf can resolve kernel symbols. There were issues running perf as root, so try running perf without sudo; it should work fine.
Set perf_event_paranoid to -1 and kptr_restrict to 0:
sudo sh -c "echo -1 > /proc/sys/kernel/perf_event_paranoid"
sudo sh -c "echo 0 > /proc/sys/kernel/kptr_restrict"
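Both settings reset on reboot, so it can help to confirm them before profiling. The following is a minimal sketch, assuming the target values given above (-1 and 0); the helper names `read_int_setting` and `check_perf_settings` are hypothetical and not part of this repository.

```python
PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"
KPTR_PATH = "/proc/sys/kernel/kptr_restrict"


def read_int_setting(path):
    """Return the integer stored in a procfs file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def check_perf_settings(paranoid_path=PARANOID_PATH, kptr_path=KPTR_PATH):
    """Return a list of mismatches against the values this README asks for."""
    problems = []
    paranoid = read_int_setting(paranoid_path)
    kptr = read_int_setting(kptr_path)
    if paranoid != -1:
        problems.append(f"perf_event_paranoid is {paranoid}, expected -1")
    if kptr != 0:
        problems.append(f"kptr_restrict is {kptr}, expected 0")
    return problems


if __name__ == "__main__":
    # Reading these files does not require root.
    for line in check_perf_settings() or ["perf settings match the expected values"]:
        print(line)
```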
The following kernels were benchmarked:
- CSR5 kernels (AVX2, AVX512, and CUDA variants)
- OSKI and pOSKI library
- CUDA and cuSPARSE libraries
- Intel MKL libraries
- TACO (both CPU and GPU)
- Custom library
For CPU, the comparison will be done against:
- Intel MKL and Inspector-Executor
- TACO CPU generated code
- pOSKI and OSKI libraries
- Custom
For GPU, the comparison will be done against:
- CUDA
- cuSPARSE
- CUSP
- Custom
- TACO GPU kernel