A simple deep learning framework that optimizes task scheduling and memory usage on different CPU/GPU architectures.
- I started this as a research project exploring task-scheduling and memory-usage optimization for DNN workloads on different architectures.
- The issue pages (including some marked as "closed") contain partial preliminary results and interesting performance patterns.
- A classmate from a GPU computing class added GPU support.
branch | build status |
---|---|
master | |
feng | |
- For the most recent documentation online, see here.
- A changelog is available at the link above.
- To test with Caffe, see my forked caffe.
```sh
git submodule update --init
mkdir build
cd build
```
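As a quick sanity check (not part of the original instructions), `git submodule status`, run from the repository root, lists each submodule with the commit it is pinned to; an entry prefixed with `-` means it was not initialized.

```sh
# From the repository root: a leading '-' marks a submodule that is not initialized.
git submodule status
```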
We use MKL for the CPU gemm(). Set up the MKL environment with:

```sh
source /opt/intel/bin/compilervars.sh intel64
source /opt/intel/mkl/bin/mklvars.sh intel64
```
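After sourcing these scripts, `MKLROOT` should be set; a quick check (assuming a standard Intel install layout):

```sh
# mklvars.sh exports MKLROOT; an empty result means the environment was not sourced.
echo "$MKLROOT"
```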
Then build with:

```sh
cmake -DUSE_MKL=on -DAWNN_USE_FLT32=on ..
```
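The compile step itself is not shown here; assuming CMake's default Unix Makefiles generator, a typical follow-up would be:

```sh
# Build with all available cores (assumes the default Makefile generator).
make -j"$(nproc)"
```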
- Build on Stampede2

In the build directory, run:

```sh
../scripts/build_stampede2.sh
```
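The script is not reproduced here; as a rough, hypothetical sketch, a Stampede2 build typically loads the Intel toolchain via environment modules and then configures against MKL, followed by the usual compile step. The module name below is an assumption, not the script's actual contents:

```sh
# Hypothetical manual equivalent of ../scripts/build_stampede2.sh (module name is an assumption).
module load intel
cmake -DUSE_MKL=on -DAWNN_USE_FLT32=on ..
```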
When MKL is not available, install OpenBLAS and build with -DUSE_OPENBLAS=on:

```sh
sudo ./install-apt.sh
cmake -DUSE_OPENBLAS=on -DAWNN_USE_FLT32=on ..
```
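To confirm that OpenBLAS is actually visible to the linker, you can check the dynamic linker cache (assumes a Linux system with `ldconfig`):

```sh
# Lists the OpenBLAS shared libraries registered with the dynamic linker, if any.
ldconfig -p | grep -i openblas
```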