Pure image processing operations
Hello,
I need NEON-accelerated simple image processing operations such as resize, Gaussian/box blur, and basic morphology (dilation, erosion, etc.). I managed to use NEScale for resizing (it would have been nicer if bicubic interpolation were supported).
As far as I can see, my only option for blurring is NEGEMMConv2d (combined with logical and element-wise ops for morphology). NEGEMMConv2d takes an extra bias parameter; I might be able to work around that, but at the cost of an extra bias addition. Is this the right path, or is there already support for such operations? I've seen older issues mentioning Gaussian blur support, but I could not find it in the latest release (23.08).
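To make the convolution idea above concrete, here is a minimal scalar sketch in plain C++ (not Arm Compute Library code). It assumes a dense, row-major U8 image and zero padding at the edges, and all function names are hypothetical. A 3x3 all-ones kernel with zero bias yields a neighbourhood sum; comparing that sum against zero gives binary dilation, and dividing it by 9 gives a box blur. Note that this trick only covers binary morphology: grayscale dilation is a 3x3 maximum, which a convolution cannot express.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 3x3 "same" convolution with an all-ones kernel and zero bias.
// Pixels outside the image are treated as zero (assumption of this sketch).
std::vector<int> conv3x3_ones(const std::vector<std::uint8_t>& src, int width, int height)
{
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    std::vector<int> acc(src.size(), 0);
    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            int sum = 0;
            for (int dy = -1; dy <= 1; ++dy)
            {
                for (int dx = -1; dx <= 1; ++dx)
                {
                    const int yy = y + dy;
                    const int xx = x + dx;
                    if (yy >= 0 && yy < height && xx >= 0 && xx < width)
                    {
                        sum += src[static_cast<std::size_t>(yy) * width + xx];
                    }
                }
            }
            acc[static_cast<std::size_t>(y) * width + x] = sum;
        }
    }
    return acc;
}

// Binary dilation: a pixel is set when any pixel in its 3x3 window is non-zero.
std::vector<std::uint8_t> dilate_binary_via_conv(const std::vector<std::uint8_t>& src, int width, int height)
{
    const std::vector<int> acc = conv3x3_ones(src, width, height);
    std::vector<std::uint8_t> out(acc.size(), 0);
    for (std::size_t i = 0; i < acc.size(); ++i)
    {
        out[i] = (acc[i] > 0) ? 255 : 0; // element-wise comparison step
    }
    return out;
}

// Box blur: the same sum divided by 9, rounded to nearest.
// Zero padding darkens the outermost row and column.
std::vector<std::uint8_t> box_blur3x3(const std::vector<std::uint8_t>& src, int width, int height)
{
    const std::vector<int> acc = conv3x3_ones(src, width, height);
    std::vector<std::uint8_t> out(acc.size(), 0);
    for (std::size_t i = 0; i < acc.size(); ++i)
    {
        out[i] = static_cast<std::uint8_t>((acc[i] + 4) / 9);
    }
    return out;
}
```

Calling `dilate_binary_via_conv(img, 8, 8)` on an 8x8 image with a single set pixel away from the edges marks the nine pixels around it. Whether the same structure maps efficiently onto a NEON convolution function is a separate question that this sketch does not answer.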
Thanks in advance
I have a follow-up question about dilation in an older version (21.02), if that's OK @morgolock
I have the following method, but it gives the wrong output on a simple case (the input is 8x8, all zeros except for a single pixel in the center; I'm expecting a dilated 3x3 region around it):
#include <cstdint>

#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/Tensor.h"

namespace armc = arm_compute;

void dilate(
    uint8_t* input_image,
    uint8_t* output_image,
    int height,
    int width
)
{
    armc::Image inp;
    armc::Image out;
    bool import = true;
    if(import)
    {
        // Describe the caller's buffers as dense U8 images and wrap them without copying.
        inp.allocator()->info().init(
            armc::TensorShape(width, height),
            armc::Format::U8,
            armc::Strides(1, width),
            0,
            width * height
        );
        out.allocator()->info().init(
            armc::TensorShape(width, height),
            armc::Format::U8,
            armc::Strides(1, width),
            0,
            width * height
        );
        inp.allocator()->import_memory(input_image);
        out.allocator()->import_memory(output_image);
    }
    else
    {
        auto inf = armc::TensorInfo(width, height, armc::Format::U8);
        inp.allocator()->init(inf);
        out.allocator()->init(inf);
    }
    armc::NEDilate mop{};
    mop.configure(&inp, &out, armc::BorderMode::UNDEFINED);
    if(!import)
    {
        // Allocate after configure(), then copy the input into the library-owned tensor.
        inp.allocator()->allocate();
        out.allocator()->allocate();
        fill_arm_image(inp, input_image, width, height); // helper not shown in this post
    }
    mop.run();
    if(!import)
        copy_from_arm_image(out, output_image, width, height); // helper not shown in this post
}
When I import from an external buffer, this leaves the output array all zeros (unchanged from its initial state). I expected it to use no padding on the input and shrink the execution window instead (following the guide here). Is there anything I'm missing?
It works fine if I allocate the memory and fill it with my data after the configuration step.
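For checking either code path against a known-good result, a plain C++ reference can help. The sketch below is not Arm Compute Library code; the function names are hypothetical. It assumes dense row-major U8 buffers and mimics an undefined border by writing only interior pixels, so the outermost row and column of the destination keep whatever value they had before the call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference 3x3 grayscale dilation (maximum filter) on a dense, row-major U8 image.
// Only interior pixels are written; the 1-pixel border of dst is left untouched,
// which mimics a shrunken valid region under an undefined border mode.
void dilate3x3_reference(const std::uint8_t* src, std::uint8_t* dst, int width, int height)
{
    assert(src != nullptr && dst != nullptr);
    assert(width >= 3 && height >= 3);
    for (int y = 1; y < height - 1; ++y)
    {
        for (int x = 1; x < width - 1; ++x)
        {
            std::uint8_t m = 0;
            for (int dy = -1; dy <= 1; ++dy)
            {
                for (int dx = -1; dx <= 1; ++dx)
                {
                    const std::size_t idx = static_cast<std::size_t>(y + dy) * width + (x + dx);
                    m = std::max(m, src[idx]);
                }
            }
            dst[static_cast<std::size_t>(y) * width + x] = m;
        }
    }
}

// Builds an all-zero image with a single pixel set at (cx, cy).
std::vector<std::uint8_t> make_single_pixel_image(int width, int height, int cx, int cy, std::uint8_t value)
{
    assert(cx >= 0 && cx < width && cy >= 0 && cy < height);
    std::vector<std::uint8_t> img(static_cast<std::size_t>(width) * height, 0);
    img[static_cast<std::size_t>(cy) * width + cx] = value;
    return img;
}
```

For the 8x8 case described above, with the pixel at (4, 4), the reference writes the set value into the 3x3 block spanning rows and columns 3 to 5 and leaves every other pixel at zero.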
Thanks again!
We removed the CV functions from ACL in release v21.05; the library now focuses on machine learning. We do not maintain or support old versions of the library. The best alternative is to use a different library such as OpenCV, which can be cross-compiled for aarch64 and performs well.
Regarding the example you shared that calls import_memory(), I think there may be a problem in the way you initialize the tensor info. There should be no need to specify the strides; you can see how it's done in https://github.com/ARM-software/ComputeLibrary/blob/main/tests/validation/NEON/UNIT/TensorAllocator.cpp#L59
TensorInfo info(TensorShape(24U, 16U, 3U), 1, DataType::F32);
// Allocate memory buffer
const size_t total_size = info.total_size();
auto data = std::make_unique<uint8_t[]>(total_size);
// Negative case : Import nullptr
Tensor t1;
t1.allocator()->init(info);
ARM_COMPUTE_ASSERT(!bool(t1.allocator()->import_memory(nullptr)));
ARM_COMPUTE_ASSERT(t1.info()->is_resizable());
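As a side note on why explicit strides are usually redundant: for a tightly packed buffer, the strides and total size follow directly from the shape and the element size. The sketch below is plain C++ arithmetic, not the library's API; `DenseLayout` and `dense_layout` are hypothetical names, and it assumes no padding in any dimension.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Byte strides and total size of a tightly packed (unpadded) 3D buffer.
struct DenseLayout
{
    std::array<std::size_t, 3> strides_in_bytes; // x, y, z
    std::size_t total_size;                      // in bytes
};

// Hypothetical helper: derives the dense layout from shape and element size.
DenseLayout dense_layout(std::size_t width, std::size_t height, std::size_t channels, std::size_t element_size)
{
    assert(width > 0 && height > 0 && channels > 0 && element_size > 0);
    DenseLayout layout{};
    layout.strides_in_bytes[0] = element_size;                  // step to the next element in a row
    layout.strides_in_bytes[1] = element_size * width;          // step to the next row
    layout.strides_in_bytes[2] = element_size * width * height; // step to the next plane
    layout.total_size = element_size * width * height * channels;
    return layout;
}
```

For an 8x8 single-channel U8 image this gives strides of 1 and 8 bytes and a total of 64 bytes, the same values as `Strides(1, width)` and `width * height` in the question. For the 24x16x3 F32 shape used in the test above it gives 4608 bytes.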
Do you get the correct results if you allocate the memory instead of importing it?
Hope this helps.