transform_strategy
BransonMarvin opened this issue · 1 comment
Hello,
Thanks for contributing this awesome piece of software to the open-source world!
I have one question regarding the convolution inference functionality.
I've seen the use of a `transform_strategy` variable in many places. What is the purpose of this variable? If I used `nnp_convolution_transform_strategy_reuse` with a Winograd convolution, would I still get correct results?
The meaning of this argument has changed over time; here I will describe the current version (which I think is here to stay). The convolution transform strategy lets you pre-compute the transformation of the filters (kernels) and then re-use the precomputed values rather than recomputing them every time an NNPACK function is called. Such pre-computation only makes sense during inference: in training, the filters change after each batch update, so their transforms would need to be recomputed too.
Currently, pre-computation is only supported for the fast convolution algorithms (Fourier transform and Winograd transform). Pre-computation (it should really be called pre-packing for these algorithms) of filters in the implicit GEMM and direct convolution algorithms is currently not supported, and `nnp_convolution_inference` calls would return `nnp_status_unsupported_transform_strategy`, but I will likely implement it in the future. Please use the `nnp_status_unsupported_transform_strategy` error code to detect yet-unsupported algorithms.
Now, to pre-compute filter transforms with a fast convolution algorithm, call `nnp_convolution_inference` with `transform_strategy = nnp_convolution_transform_strategy_precompute`, `workspace_buffer = NULL`, and a non-NULL `workspace_size`. The `input`, `bias`, `output`, and `kernel` arguments can be `NULL` in this call, but all other arguments must have the same values as you would later pass to compute the convolution. If this call returns `nnp_status_success`, `*workspace_size` holds the size of the transformed filter tensor. Now allocate a buffer of at least `*workspace_size` bytes, aligned to at least 64 bytes, for the transformed filter tensor.

Next, call `nnp_convolution_inference` again with `transform_strategy = nnp_convolution_transform_strategy_precompute`, the size of the allocated buffer in `*workspace_size`, and a pointer to the buffer in `workspace_buffer`. The `input`, `bias`, and `output` arguments can be `NULL` in this call, but `kernel` and all other arguments must have the same values as you would later pass to compute the convolution. If this call returns `nnp_status_success`, NNPACK has pre-computed the transformed filters and stored them in your buffer. Now you can call `nnp_convolution_inference` as many times as you want with `transform_strategy = nnp_convolution_transform_strategy_reuse`, passing a pointer to the buffer with the pre-computed filter transforms in the `kernel` argument.
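The three-step protocol above can be sketched as follows. This is a minimal illustration, not tested code: it assumes NNPACK's public `<nnpack.h>` header and the upstream `nnp_convolution_inference` signature, and the layer shape (64→64 channels, 56×56 input, 3×3 kernel, Winograd `wt8x8`) is an arbitrary example, not from the original discussion.

```c
/* Sketch of the precompute/reuse protocol. Assumes <nnpack.h> from upstream
 * NNPACK; the layer shape here is an arbitrary illustration. */
#include <stdio.h>
#include <stdlib.h>
#include <nnpack.h>

int main(void) {
	if (nnp_initialize() != nnp_status_success) {
		fprintf(stderr, "NNPACK is not usable on this CPU\n");
		return 1;
	}

	const enum nnp_convolution_algorithm algorithm = nnp_convolution_algorithm_wt8x8;
	const size_t input_channels = 64, output_channels = 64;
	const struct nnp_size input_size = { .width = 56, .height = 56 };
	const struct nnp_padding input_padding = { .top = 1, .right = 1, .bottom = 1, .left = 1 };
	const struct nnp_size kernel_size = { .width = 3, .height = 3 };
	const struct nnp_size output_subsampling = { .width = 1, .height = 1 };

	float* input  = calloc(input_channels * 56 * 56, sizeof(float));
	float* kernel = calloc(output_channels * input_channels * 3 * 3, sizeof(float));
	float* bias   = calloc(output_channels, sizeof(float));
	float* output = calloc(output_channels * 56 * 56, sizeof(float));

	/* Step 1: query the transformed-filter size.
	 * workspace_buffer = NULL, workspace_size non-NULL;
	 * input, kernel, bias, and output may all be NULL here. */
	size_t transformed_kernel_size = 0;
	enum nnp_status status = nnp_convolution_inference(
		algorithm, nnp_convolution_transform_strategy_precompute,
		input_channels, output_channels,
		input_size, input_padding, kernel_size, output_subsampling,
		NULL /* input */, NULL /* kernel */, NULL /* bias */, NULL /* output */,
		NULL /* workspace_buffer */, &transformed_kernel_size,
		nnp_activation_identity, NULL, NULL /* threadpool */, NULL /* profile */);
	if (status == nnp_status_unsupported_transform_strategy) {
		/* implicit GEMM / direct algorithm: pre-computation not yet supported */
		fprintf(stderr, "pre-computation unsupported for this algorithm\n");
		return 1;
	}

	/* Step 2: pre-compute filter transforms into a 64-byte-aligned buffer.
	 * kernel must now point to the real filters; input/bias/output may be NULL. */
	void* transformed_kernel = NULL;
	if (posix_memalign(&transformed_kernel, 64, transformed_kernel_size) != 0)
		return 1;
	status = nnp_convolution_inference(
		algorithm, nnp_convolution_transform_strategy_precompute,
		input_channels, output_channels,
		input_size, input_padding, kernel_size, output_subsampling,
		NULL, kernel, NULL, NULL,
		transformed_kernel, &transformed_kernel_size,
		nnp_activation_identity, NULL, NULL, NULL);

	/* Step 3: run inference any number of times, passing the pre-computed
	 * transforms through the kernel argument. */
	status = nnp_convolution_inference(
		algorithm, nnp_convolution_transform_strategy_reuse,
		input_channels, output_channels,
		input_size, input_padding, kernel_size, output_subsampling,
		input, (const float*) transformed_kernel, bias, output,
		NULL, NULL,
		nnp_activation_identity, NULL, NULL, NULL);

	free(transformed_kernel);
	free(input); free(kernel); free(bias); free(output);
	nnp_deinitialize();
	return status == nnp_status_success ? 0 : 1;
}
```

Note that in step 3 the `workspace_buffer`/`workspace_size` pair is passed as `NULL`/`NULL`, so NNPACK allocates its own scratch memory for the input and output transforms internally; you could also query and provide that workspace yourself.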
Note that this method saves computation, but it increases the memory footprint. Normally, NNPACK computes filter transforms in slices and uses only enough memory for one slice. Pre-computing filter transforms requires allocating enough memory for the whole transformed filter tensor.