Maratyszcza/NNPACK

Integration with deep learning frameworks

Maratyszcza opened this issue · 21 comments

NNPACK needs to be integrated into deep learning frameworks to deliver performance benefits to end-users. If you would like to work on such integration, please comment here.

@ajtulloch contributed basic integration of NNPACK's convolutional layers into Caffe, which is now available at Maratyszcza/caffe-nnpack. This project is not complete: it only exposes NNPACK's training-optimized forward propagation in convolution layers to Caffe, but it is a good starting point.

As discussed, https://gist.github.com/ajtulloch/92c8e7c8bbc877df9657 extends the Caffe integration to Pooling and InnerProduct layers. I can send a diff for that, and also add backpropagation in the convolutional layers (modulo fixing the grad check issue mentioned).

https://github.com/ajtulloch/caffe/tree/nnpack-pr is the branch (adds InnerProduct, Pooling, a shared threadpool, etc) that I'm proposing to be upstreamed into Caffe. Does that branch look good to you @Maratyszcza ?

Thanks @ajtulloch! I was working on porting your diff to caffe-nnpack, but now I'd rather switch to your branch. Some problems I see:

  1. *.hpp files in src/caffe/layers should probably go into include/caffe/layers
  2. In the NNPackPoolingLayer it is better to check the return code of nnp_max_pooling_output to detect situations not handled by NNPACK rather than hard-code them in the bindings (see the sketch after this list). Note: if NNPACK returns anything different from nnp_status_success, the outputs of the function call are unchanged.
  3. The code below is not valid C++11. gcc and clang allow it, but older versions of Microsoft C++ would complain; a portable alternative is shown in the sketch after this list. However, I'm not sure whether the Caffe maintainers care about Windows or MSVC:
const nnp_padding input_padding = {
      .top = static_cast<size_t>(this->pad_h_),
      .right = static_cast<size_t>(this->pad_w_),
      .bottom = static_cast<size_t>(this->pad_h_),
      .left = static_cast<size_t>(this->pad_w_)};
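To make points 2 and 3 concrete, here is a minimal sketch, assuming NNPACK's nnp_max_pooling_output signature and that nnp_initialize() has already been called; the sizes, channel count, and the helper name pool_forward are placeholders, not the actual NNPackPoolingLayer code:

#include <nnpack.h>

// Illustrative helper, not the real Caffe layer code.
nnp_status pool_forward(const float* bottom, float* top,
                        size_t pad_h, size_t pad_w) {
  // Point 3: designated initializers are C99 (and only C++20); assigning
  // the members by name keeps the code valid C++11 and portable to MSVC.
  nnp_padding input_padding;
  input_padding.top = pad_h;
  input_padding.right = pad_w;
  input_padding.bottom = pad_h;
  input_padding.left = pad_w;

  const nnp_size input_size = {28, 28};   // placeholder width/height
  const nnp_size pooling_size = {2, 2};
  const nnp_size pooling_stride = {2, 2};

  // Point 2: let NNPACK report unsupported configurations instead of
  // hard-coding them; on any status other than nnp_status_success the
  // output buffer is untouched, so the caller can fall back to Caffe's
  // own pooling path.
  return nnp_max_pooling_output(
      /*batch_size=*/1, /*channels=*/16,
      input_size, input_padding, pooling_size, pooling_stride,
      bottom, top, /*threadpool=*/nullptr);
}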

In case this hasn't already been done, it's probably a good idea to create a separate issue in the GitHub repository of each framework.

I'd also love to help work on the integrations, but I probably don't have the required technical knowledge to be of much use (can you point me to any resources that may help me?)
Thanks for creating this!

I created some node bindings for NNPACK (https://www.npmjs.com/package/node-nnpack). I am using it as backend in a JavaScript-based deep learning framework.

Thanks @andreasgal, I added a link to the NNPACK README.

@Maratyszcza If you have time, please update the reference to tiny-dnn (it's no longer tiny-cnn!). NNPACK is now fully integrated in the master branch.

bhack commented

@Maratyszcza I also see that you have started introducing NEON and fp16. Do you plan to go to lower precision as well, like gemmlowp?

@edgarriba Thanks, updated the link and description.

@bhack If there is a user (i.e. a high-level framework) for this feature, I will implement it. AFAIK, right now only TensorFlow supports 8-bit fixed-point inference, but its tensor formats are not compatible with NNPACK.

bhack commented

We also support 8-bit in tiny-dnn, thanks to @wangyida's effort.

bhack commented

NVIDIA TensorRT 2 also supports INT8.

I am trying to build the nnpack-pr branch of ajtulloch/caffe, but it seems the integration needs to be updated. Among other things, I am getting complaints about the enum constants:

src/caffe/layers/nnpack_convolution_layer.cpp:39:7: error: use of undeclared identifier 'nnp_convolution_transform_strategy_reuse'; did you mean 'nnp_convolution_transform_strategy_precomputed'?
      nnp_convolution_transform_strategy_reuse;

Ok, reuse probably corresponds to precomputed, but what is the new value for nnp_convolution_transform_strategy_recompute: block_based or tuple_based? (src/caffe/layers/nnpack_convolution_layer.cpp:43)

A follow-up question: is there a toolkit that can make use of NNPACK out of the box? For Caffe2 I can only see the sources in caffe2/contrib/nnpack, but no integration of them.

I can update that branch with the new NNPACK API, give me a few days. For Caffe2, just building with NNPACK and setting engine="NNPACK" in the OperatorDef is sufficient (see the sketch below). For Torch, https://github.com/szagoruyko/nnpack.torch/blob/master/SpatialConvolution.lua should work as well.
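For reference, a minimal sketch of the Caffe2 side, assuming the protobuf-generated OperatorDef API from caffe2/proto/caffe2.pb.h; the blob names, the kernel argument, and the helper name MakeNNPackConv are placeholders:

#include "caffe2/proto/caffe2.pb.h"  // assumed include path for OperatorDef

caffe2::OperatorDef MakeNNPackConv() {
  caffe2::OperatorDef def;
  def.set_type("Conv");
  def.add_input("X");        // input activations (placeholder blob names)
  def.add_input("W");        // convolution weights
  def.add_input("b");        // bias
  def.add_output("Y");
  def.set_engine("NNPACK");  // dispatch this Conv operator to the NNPACK engine
  caffe2::Argument* kernel = def.add_arg();
  kernel->set_name("kernel");
  kernel->set_i(3);
  return def;
}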

@ajtulloch Andrew, can I expect you to fix the integration in the near future? This is the last piece of the puzzle I don't have yet.

bhack commented

Will fully-connected backward be available?

@bhack Fully-connected backward is not yet implemented in NNPACK

bhack commented

Yes, I know, we are using NNPACK... :) but will it be?

It is not hard to do, but I'm oversubscribed with tasks