Integration with deep learning frameworks
Maratyszcza opened this issue · 21 comments
NNPACK needs to be integrated into deep learning frameworks to deliver performance benefits to end-users. If you would like to work on such integration, please comment here.
@ajtulloch contributed basic integration of NNPACK's convolutional layers into Caffe, which is now available at Maratyszcza/caffe-nnpack. This project is not complete: it only exposes NNPACK's training-optimized forward propagation in convolution layers to Caffe, but it is a good starting point.
As discussed, https://gist.github.com/ajtulloch/92c8e7c8bbc877df9657 extends the Caffe integration to Pooling and InnerProduct layers. I can send a diff to that, and also add the backpropagation in convolutional layers (modulo fixing the grad check issue mentioned)
https://github.com/ajtulloch/caffe/tree/nnpack-pr is the branch (adds InnerProduct, Pooling, a shared threadpool, etc) that I'm proposing to be upstreamed into Caffe. Does that branch look good to you @Maratyszcza ?
Thanks @ajtulloch! I was working on porting your diff to caffe-nnpack
, but now I'd rather switch to your branch. Some problems I see:
*.hpp
files insrc/caffe/layers
should probably go intoinclude/caffe/layers
- In the
NNPackPoolingLayer
is is better to check the return code ofnnp_max_pooling_output
to detect situations not handled by NNPACK rather then hard-code them in the bindings. Note: if NNPACK returns anything different fromnnp_status_success
, the outputs of a function call are unchanged. - The code below is not C++11. gcc and clang allow it, but older versions of Microsoft C++ would complain. However, not sure if Caffe maintainers care about Windows or MSVC
const nnp_padding input_padding = {
.top = static_cast<size_t>(this->pad_h_),
.right = static_cast<size_t>(this->pad_w_),
.bottom = static_cast<size_t>(this->pad_h_),
.left = static_cast<size_t>(this->pad_w_)};
In case this hasn't already been done, it's probably a good idea to create separate issues in each of the GitHub repositories for each framework:
I'd also love to help work on the integrations, but I probably don't have the required technical knowledge to be of much use (can you point me to any resources that may help me?)
Thanks for creating this!
/cc @edgarriba
I created some node bindings for NNPACK (https://www.npmjs.com/package/node-nnpack). I am using it as backend in a JavaScript-based deep learning framework.
Thanks @andreasgal, I added a link to NNPACK readme
@Maratyszcza If you have time update reference to tiny-dnn (no tiny-cnn anymore!). Now NNPACK is fully integrated in master branch
@Maratyszcza Also I See that you have started to introducing neon and fp16. Do you plan to go also on lower precision like gemmlowp?
@edgarriba Thanks, updated the link and description.
@bhack If there is a user (i.e. high-level framework) for this feature, I will implement it. AFAIK, right now only Tensorflow supports 8-bit fixed-point inference, but its tensor formats are not compatible with NNPACK.
Also Nvidia Tensorrt 2 support INT8
I am trying to build the nnpack-pr
branch of ajtulloch/caffe, but It seems the integration must be updated. Among others, I am getting complaints about the constant values:
src/caffe/layers/nnpack_convolution_layer.cpp:39:7: error: use of undeclared identifier 'nnp_convolution_transform_strategy_reuse'; did you mean 'nnp_convolution_transform_strategy_precomputed'?
nnp_convolution_transform_strategy_reuse;
Ok, reuse
probably corresponds to precomputed
, but what is the new value for nnp_convolution_transform_strategy_recompute
: block_based or tuple_based? (src/caffe/layers/nnpack_convolution_layer.cpp:43)
Following the question: Is there a toolkit that can make us of NNPACK out of the box? Because for Caffe2 I can see only sources in caffe2/contrib/nnpack
, but no integration of them.
I can update that branch with the new NNPACK API, give me a few days. For caffe2, just building and setting engine="NNPACK" in the OperatorDef is sufficient. For Torch, https://github.com/szagoruyko/nnpack.torch/blob/master/SpatialConvolution.lua should work as well.
@dprylipko also we support it in tiny-dnn
https://github.com/tiny-dnn/tiny-dnn/blob/master/tiny_dnn/core/kernels/conv2d_op_nnpack.h
@ajtulloch Andrew, can I expect you are fixing the integration in the nearest future? This is the last piece of puzzle I don't have yet.
Will fullconnected backward be available?
@bhack Fully-connected backward is not yet implemented in NNPACK
Yes I know we are using NNPack... :) but will?
It is not hard to do, but I'm oversubscribed with tasks