Written by Jan Morez (University of Antwerp) as part of a bachelor's thesis that uses a block-matching algorithm to gather a statistical population for denoising single pixels in an image.
The findMatches CUDA kernel (along with Matlab helper functions) finds matches for every pixel in a search window of size M by N. The algorithm is based on the following paper: http://www.mia.uni-saarland.de/Publications/zimmer-lnla08.pdf
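To give a rough idea of what block matching inside a search window looks like, here is a minimal sketch. It is not the findMatches kernel itself: the names bestMatchSketch, runBestMatch and clampIndex are hypothetical, and so are the choices made here (plain sum of squared differences, one best match per pixel, odd window sizes, clamped borders). The real kernel and the method in the paper may well differ, so check the comments in the .cu file.

```cuda
#include <cstdio>
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

// Clamp a coordinate to [0, n-1] so patches near the border stay inside the image.
__host__ __device__ inline int clampIndex(int v, int n) {
    return v < 0 ? 0 : (v >= n ? n - 1 : v);
}

// Hypothetical kernel, one thread per reference pixel (x, y).
// It scans a searchW by searchH window (assumed odd sizes) centred on the pixel
// and stores the offset of the candidate whose patch has the smallest SSD.
// If no candidate lies inside the image, the stored offset stays (0, 0).
__global__ void bestMatchSketch(const float* img, int width, int height,
                                int searchW, int searchH, int patchR,
                                int* bestDx, int* bestDy) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    float best = FLT_MAX;
    int bx = 0, by = 0;
    for (int dy = -searchH / 2; dy <= searchH / 2; ++dy) {
        for (int dx = -searchW / 2; dx <= searchW / 2; ++dx) {
            if (dx == 0 && dy == 0) continue;           // skip the pixel itself
            int cx = x + dx, cy = y + dy;
            if (cx < 0 || cx >= width || cy < 0 || cy >= height) continue;
            float ssd = 0.0f;                           // patch distance
            for (int py = -patchR; py <= patchR; ++py) {
                for (int px = -patchR; px <= patchR; ++px) {
                    float a = img[clampIndex(y + py, height) * width + clampIndex(x + px, width)];
                    float b = img[clampIndex(cy + py, height) * width + clampIndex(cx + px, width)];
                    ssd += (a - b) * (a - b);
                }
            }
            if (ssd < best) { best = ssd; bx = dx; by = dy; }
        }
    }
    bestDx[y * width + x] = bx;
    bestDy[y * width + x] = by;
}

// Hypothetical host helper: copies the image to the GPU, runs the kernel,
// and copies the per-pixel offsets back. Returns false on any CUDA error.
bool runBestMatch(const std::vector<float>& img, int width, int height,
                  int searchW, int searchH, int patchR,
                  std::vector<int>& dx, std::vector<int>& dy) {
    size_t n = static_cast<size_t>(width) * height;
    float* dImg = nullptr; int* dDx = nullptr; int* dDy = nullptr;
    cudaError_t err = cudaMalloc(&dImg, n * sizeof(float));
    if (err == cudaSuccess) err = cudaMalloc(&dDx, n * sizeof(int));
    if (err == cudaSuccess) err = cudaMalloc(&dDy, n * sizeof(int));
    if (err == cudaSuccess)
        err = cudaMemcpy(dImg, img.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(16, 16);
        dim3 grid((width + 15) / 16, (height + 15) / 16);
        bestMatchSketch<<<grid, block>>>(dImg, width, height, searchW, searchH, patchR, dDx, dDy);
        err = cudaGetLastError();
    }
    dx.resize(n); dy.resize(n);
    if (err == cudaSuccess)
        err = cudaMemcpy(dx.data(), dDx, n * sizeof(int), cudaMemcpyDeviceToHost);
    if (err == cudaSuccess)
        err = cudaMemcpy(dy.data(), dDy, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dImg); cudaFree(dDx); cudaFree(dDy);
    return err == cudaSuccess;
}

int main() {
    // Synthetic 8x8 image whose columns repeat with period 4.
    const int w = 8, h = 8;
    std::vector<float> img(w * h);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) img[y * w + x] = static_cast<float>(x % 4);

    std::vector<int> dx, dy;
    if (!runBestMatch(img, w, h, /*searchW=*/9, /*searchH=*/1, /*patchR=*/1, dx, dy)) {
        std::printf("CUDA error\n");
        return 1;
    }
    std::printf("pixel (2,4): dx=%d dy=%d\n", dx[4 * w + 2], dy[4 * w + 2]);
    return 0;
}
```

The sketch can be built as a standalone program with nvcc; it does not go through Matlab or MEX. One thread per reference pixel keeps the code short, but it reads the image from global memory many times, so it should not be taken as an indication of how the real kernel is organised or how fast it runs.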
The src/C/findMatches.cu file has to be compiled into a MEX file. The comments in the .cu file tell you which other inputs are required. If you intend to use this code and are confused by the horrible documentation, you can contact me at jan.morez AT gmail DOT com