Maratyszcza/NNPACK

transform_strategy

BransonMarvin opened this issue · 1 comment

Hello,

Thanks for contributing this awesome piece of software to the open-source world!
I have one question regarding the convolution inference functionality.
I've seen the variable transform_strategy used in many places. What is the purpose of this variable? If I used nnp_convolution_transform_strategy_reuse with a Winograd convolution, would I still get the correct results?

The meaning of this argument has changed over time; here I will describe the current version (which I think is here to stay). The convolution transform strategy lets you pre-compute the transformation of the filters (kernels) and then re-use the pre-computed values rather than recomputing them every time an NNPACK function is called. Such pre-computation only makes sense during inference: in training, the filters change after each batch update, so their transforms would need to be recomputed as well.

Currently, pre-computation is only supported for the fast convolution algorithms (Fourier transform and Winograd transform). Pre-computation (it should really be called pre-packing for these algorithms) of filters in the implicit GEMM and direct convolution algorithms is currently not supported, and nnp_convolution_inference calls will return nnp_status_unsupported_transform_strategy; I will likely implement it in the future. Use the nnp_status_unsupported_transform_strategy error code to detect yet-unsupported algorithms.

Now, to pre-compute filter transforms with a fast convolution algorithm:

1. Call nnp_convolution_inference with transform_strategy = nnp_convolution_transform_strategy_precompute, workspace_buffer = NULL, and a non-NULL workspace_size. The input, kernel, bias, and output arguments can be NULL in this call, but all other arguments must have the same values you would later pass to compute the convolution. If this call returns nnp_status_success, *workspace_size contains the size of the transformed filter tensor.
2. Allocate a buffer of at least *workspace_size bytes, aligned to at least 64 bytes, for the transformed filter tensor.
3. Call nnp_convolution_inference again with transform_strategy = nnp_convolution_transform_strategy_precompute, the size of the allocated buffer in *workspace_size, and a pointer to the buffer in workspace_buffer. The input, bias, and output arguments can be NULL in this call, but kernel and all other arguments must have the same values you would later pass to compute the convolution. If this call returns nnp_status_success, NNPACK has pre-computed the filter transforms and stored them in your buffer.
4. Now you can call nnp_convolution_inference as many times as you want with transform_strategy = nnp_convolution_transform_strategy_reuse, passing the pointer to the buffer with pre-computed filter transforms in the kernel argument.
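To make the sequence concrete, here is a minimal sketch in C. It assumes the current nnp_convolution_inference signature and must be linked against NNPACK; the layer shapes (64 channels, 56x56 input, 3x3 kernel) and the use of the wt8x8 Winograd algorithm are illustrative, not prescribed:

```c
#include <stdlib.h>
#include <nnpack.h>

/* Sketch: pre-compute Winograd filter transforms once, then reuse them
 * for repeated inference calls. Shapes are illustrative. */
enum nnp_status convolve_with_precomputed_transforms(
    const float* input, const float* kernel, const float* bias, float* output)
{
    const size_t input_channels = 64, output_channels = 64;
    const struct nnp_size input_size = { 56, 56 };
    const struct nnp_padding input_padding = { 1, 1, 1, 1 };
    const struct nnp_size kernel_size = { 3, 3 };
    const struct nnp_size output_subsampling = { 1, 1 };

    /* Step 1: query the size of the transformed filter tensor
     * (workspace_buffer = NULL, non-NULL workspace_size). */
    size_t transformed_kernel_size = 0;
    enum nnp_status status = nnp_convolution_inference(
        nnp_convolution_algorithm_wt8x8,
        nnp_convolution_transform_strategy_precompute,
        input_channels, output_channels,
        input_size, input_padding, kernel_size, output_subsampling,
        NULL /* input */, NULL /* kernel */, NULL /* bias */, NULL /* output */,
        NULL /* workspace_buffer */, &transformed_kernel_size,
        nnp_activation_identity, NULL, NULL /* threadpool */, NULL /* profile */);
    if (status != nnp_status_success)
        return status; /* e.g. nnp_status_unsupported_transform_strategy */

    /* Step 2: allocate a 64-byte-aligned buffer of that size. */
    void* transformed_kernel = aligned_alloc(64, transformed_kernel_size);
    if (transformed_kernel == NULL)
        return nnp_status_out_of_memory;

    /* Step 3: fill the buffer with the pre-computed filter transforms
     * (kernel must be valid; input/bias/output may be NULL). */
    status = nnp_convolution_inference(
        nnp_convolution_algorithm_wt8x8,
        nnp_convolution_transform_strategy_precompute,
        input_channels, output_channels,
        input_size, input_padding, kernel_size, output_subsampling,
        NULL, kernel, NULL, NULL,
        transformed_kernel, &transformed_kernel_size,
        nnp_activation_identity, NULL, NULL, NULL);
    if (status != nnp_status_success) {
        free(transformed_kernel);
        return status;
    }

    /* Step 4: compute the convolution, passing the buffer with the
     * pre-computed transforms in the kernel argument. This call can be
     * repeated as many times as needed with new input/output. */
    status = nnp_convolution_inference(
        nnp_convolution_algorithm_wt8x8,
        nnp_convolution_transform_strategy_reuse,
        input_channels, output_channels,
        input_size, input_padding, kernel_size, output_subsampling,
        input, transformed_kernel, bias, output,
        NULL, NULL, /* let NNPACK manage its per-call workspace */
        nnp_activation_identity, NULL, NULL, NULL);

    free(transformed_kernel);
    return status;
}
```

In step 4, only input, output, and the kernel pointer change between calls; everything else must match the precompute calls exactly.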

Note that this method saves computation, but increases memory footprint. Normally, NNPACK computes filter transforms in slices and uses only enough memory for one slice; pre-computing filter transforms requires allocating enough memory for the whole transformed filter tensor.