/canny-edge-parallel

A parallel implementation of the Canny Edge Detection algorithm using OpenMP, CUDA and OpenCL.

Primary language: C++. License: GNU General Public License v3.0 (GPL-3.0).

Parallel Implementation of Canny Edge Detection

There are four Visual Studio 2019 projects, one for each implementation: OpenMP, CUDA, OpenCL and Serial.

Prerequisites

  • CUDA Toolkit
  • OpenMP
  • OpenCL
  • Visual Studio

Speedup Results

[Image: speedup results]

More detailed documentation can be found in canny_doc.pdf.

Limitations and Future Work

[Image: Nvidia Visual Profiler output]

Nvidia Visual Profiler shows that the apply_gaussian_filter kernel accounts for 59.1% of the computation time and the apply_sobel_filter kernel for 33.1%. It also shows that there is no kernel concurrency. We can introduce kernel concurrency by splitting the apply_sobel_filter kernel into two kernels, sobel_seperable_pass_x and sobel_seperable_pass_y, each applying one direction of the Sobel filter. Neither kernel depends on the other's output, so the two can be executed concurrently.
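The independence of the two Sobel directions can be illustrated on the CPU. The following is a minimal C++ sketch, not the repository's CUDA code: the `Image` type and the functions `applyKernel3x3` and `gradientMagnitude` are hypothetical names introduced for this example, and OpenMP `sections` stand in for concurrent kernel launches. If OpenMP is not enabled at compile time, the pragmas are ignored and the two passes simply run one after the other.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical row-major grayscale image type, used only for this sketch.
struct Image {
    int width;
    int height;
    std::vector<float> data;

    Image(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, 0.0f) {}

    float at(int x, int y) const { return data[static_cast<std::size_t>(y) * width + x]; }
    float& at(int x, int y) { return data[static_cast<std::size_t>(y) * width + x]; }
};

// Apply one 3x3 kernel (correlation form, no kernel flip) to the interior
// pixels. Border pixels are left at zero to keep the sketch short.
Image applyKernel3x3(const Image& in, const float (&k)[3][3]) {
    assert(in.width >= 3 && in.height >= 3);
    Image out(in.width, in.height);
    for (int y = 1; y < in.height - 1; ++y) {
        for (int x = 1; x < in.width - 1; ++x) {
            float sum = 0.0f;
            for (int j = -1; j <= 1; ++j)
                for (int i = -1; i <= 1; ++i)
                    sum += k[j + 1][i + 1] * in.at(x + i, y + j);
            out.at(x, y) = sum;
        }
    }
    return out;
}

// Compute the horizontal and vertical Sobel responses independently, then
// combine them into a gradient magnitude.
Image gradientMagnitude(const Image& in) {
    static const float kx[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    static const float ky[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};

    Image gx(in.width, in.height);
    Image gy(in.width, in.height);

    // The two passes read the same input and write to different outputs,
    // so neither depends on the other and they may run at the same time.
    #pragma omp parallel sections
    {
        #pragma omp section
        gx = applyKernel3x3(in, kx);

        #pragma omp section
        gy = applyKernel3x3(in, ky);
    }

    // The magnitude needs both results, so it runs after both passes finish.
    Image mag(in.width, in.height);
    for (std::size_t i = 0; i < mag.data.size(); ++i)
        mag.data[i] = std::hypot(gx.data[i], gy.data[i]);
    return mag;
}
```

The only synchronisation point is the magnitude step, which needs both gradients. In a CUDA version the same structure would presumably map to two kernels launched on separate streams, followed by a kernel that waits for both.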

In addition, we found that the Sobel and Gaussian filters are separable. The current implementation does not exploit this, so a filter with a window size of M×M performs M² operations per pixel. With separable filters, the cost would be reduced to M + M = 2M operations per pixel. This is a two-step process in which the intermediate result of the first one-dimensional convolution is stored and then convolved with the second one-dimensional filter to produce the output. We believe that using separable filters would improve performance significantly.
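The cost argument above can be checked with a small CPU example. This is a sketch under stated assumptions, not the project's implementation: `Image`, `sample`, `filter2D`, `filterSeparable` and `outerProduct` are hypothetical names, borders are handled by clamping to the nearest edge pixel, and the filters are applied in correlation form (no kernel flip). `filter2D` performs M×M multiply-adds per pixel, while `filterSeparable` performs M for the horizontal pass plus M for the vertical pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major grayscale image type, used only for this sketch.
struct Image {
    int width;
    int height;
    std::vector<float> data;

    Image(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, 0.0f) {}

    float at(int x, int y) const { return data[static_cast<std::size_t>(y) * width + x]; }
    float& at(int x, int y) { return data[static_cast<std::size_t>(y) * width + x]; }
};

// Read a pixel, clamping out-of-range coordinates to the nearest edge pixel.
inline float sample(const Image& im, int x, int y) {
    x = std::max(0, std::min(x, im.width - 1));
    y = std::max(0, std::min(y, im.height - 1));
    return im.at(x, y);
}

// Direct 2D filtering: M*M multiply-adds per pixel.
// kernel2d is row-major with M*M entries; M must be odd.
Image filter2D(const Image& in, const std::vector<float>& kernel2d, int M) {
    assert(M % 2 == 1 && kernel2d.size() == static_cast<std::size_t>(M) * M);
    const int r = M / 2;
    Image out(in.width, in.height);
    for (int y = 0; y < in.height; ++y)
        for (int x = 0; x < in.width; ++x) {
            float sum = 0.0f;
            for (int j = -r; j <= r; ++j)
                for (int i = -r; i <= r; ++i)
                    sum += kernel2d[(j + r) * M + (i + r)] * sample(in, x + i, y + j);
            out.at(x, y) = sum;
        }
    return out;
}

// Separable filtering: a horizontal pass with `row`, whose result is stored
// in `tmp`, followed by a vertical pass with `col`. M + M multiply-adds per pixel.
Image filterSeparable(const Image& in, const std::vector<float>& row,
                      const std::vector<float>& col) {
    assert(row.size() % 2 == 1 && col.size() == row.size());
    const int r = static_cast<int>(row.size()) / 2;
    Image tmp(in.width, in.height);
    Image out(in.width, in.height);

    for (int y = 0; y < in.height; ++y)          // first pass: along x
        for (int x = 0; x < in.width; ++x) {
            float sum = 0.0f;
            for (int i = -r; i <= r; ++i)
                sum += row[i + r] * sample(in, x + i, y);
            tmp.at(x, y) = sum;
        }

    for (int y = 0; y < in.height; ++y)          // second pass: along y
        for (int x = 0; x < in.width; ++x) {
            float sum = 0.0f;
            for (int j = -r; j <= r; ++j)
                sum += col[j + r] * sample(tmp, x, y + j);
            out.at(x, y) = sum;
        }
    return out;
}

// Build the equivalent M×M kernel as the outer product col * row^T.
std::vector<float> outerProduct(const std::vector<float>& col,
                                const std::vector<float>& row) {
    std::vector<float> k(col.size() * row.size());
    for (std::size_t j = 0; j < col.size(); ++j)
        for (std::size_t i = 0; i < row.size(); ++i)
            k[j * row.size() + i] = col[j] * row[i];
    return k;
}
```

For a separable kernel such as the 3×3 binomial blur, whose row and column factors are both {0.25, 0.5, 0.25}, `filterSeparable(img, row, col)` should agree with `filter2D(img, outerProduct(col, row), 3)` up to floating-point rounding. The 3×3 Sobel x kernel factors the same way, into the column {1, 2, 1} and the row {-1, 0, 1}.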

We can also use multiple streams to overlap memcpy with kernel execution. The improvement may not be large, but it is one more thing that can be done to push performance to its limit.

Contribute

  • Fork the project.
  • Make your feature addition or bug fix.
  • Report issues.
  • Send me a pull request.