Added base implementation with fbf37cd. I need to verify some of the results in more details, but it looks correct at first glance. We can easily apply the preconditioner to the resulting matrix A (of size (M-1)^2 x (M-1)^2)
I will also update the writeup to include in the paper.