Image unshredding using a TSP solver

Introduction

Yesterday I saw a fun demo by Nayuki using simulated annealing to reconstruct photographs whose columns have been shuffled.

For example, the photograph Blue Hour in Paris (CC licensed by Falcon® Photography):

Blue hour in Paris

is shuffled to produce:

Blue hour in Paris, shuffled

and the simulated annealing algorithm (starting temperature 4000, 1 billion iterations) reconstructs this:

Blue hour in Paris, reconstructed using simulated annealing

Subsequently Sangaline showed that the images can be reconstructed faster and more effectively using a simple greedy algorithm to pick the most-similar column at each step. The greedy algorithm produces this:

Blue hour in Paris, reconstructed using a greedy algorithm

which is quite close to the original, though you can see some misplaced columns in the sky at the right.

Image unshredding is an instance of the Travelling Salesman Problem

Our task is to piece the columns of pixels together so that, over all, adjacent columns are as similar as possible. Think of the columns as being nodes in a weighted graph, with the edge-weight between two columns being a dissimilarity measure. Then we are looking for a Hamiltonian path of minimum weight in the graph. So it is an instance of the Travelling Salesman Problem.

(A small technical note: since we want a Hamiltonian path rather than a Hamiltonian cycle, we add a dummy node to the graph that has weight-0 edges to all the other nodes. If we can find a least-weight Hamiltonian cycle on this augmented graph, we remove the dummy node to obtain a least-weight Hamiltonian path on the original graph.)

This project

This project uses a fast approximate solver for the Travelling Salesman Problem to reconstruct the images quickly and perfectly.

Blue hour in Paris, reconstructed using LKH

You will notice that the image is flipped, but otherwise reconstructed perfectly. It is impossible in general to distinguish an image from its flipped version when the columns have been shuffled, and all the algorithms mentioned here produce flipped reconstructions half the time.

Apart from that, I believe this algorithm can correctly reconstruct all the images in Nayuki’s demo.

The dissimilarity measure matters

One interesting thing I found is that the result is sensitive to the dissimilarity measure used. I have used the same measure as the other projects mentioned here: the sum of the absolute values of the differences in the R/G/B channels, summed over all pixels in the column. If instead we use the square rather than the absolute value, the image is reconstructed incorrectly as follows:

Blue hour in Paris, reconstructed using LKH

This is not a failure of the TSP algorithm: in fact this mangled image has a better score than the original, using the sum-of-squares measure!

Running the code

  • Clone the repository
  • Run make to download and shuffle the images, and download and compile LKH.
  • Now you can run make reconstruct to reconstruct the images from their shuffled versions using LKH.

You can also run make nayuki to reconstruct the images using Nayuki’s simulated annealing code.

Prerequisites

You will need a working build environment (Make and a C compiler), and curl is used to download files from the web. You also need libpng >= 1.6 (which the Makefile assumes to be in /usr/local, but that is easy to change). The simpler bits of image manipulation are done using Python 2 and require the Python Imaging Library or a compatible fork such as Pillow.

Double-shuffling

It is perhaps not surprising, but rather striking, that if we shuffle the columns and then the rows to obtain a really scrambled-looking image like this:

Blue hour in Paris, shuffled by columns and then by rows

that it can nevertheless be reconstructed perfectly by applying the algorithm twice, first to the rows and then to the columns. Of course now the result may be flipped vertically as well as horizontally, but in this case I happened to get lucky twice and it came out in the same orientation as the original:

Blue hour in Paris, reconstructed from the double-shuffled version

The code for the double-shuffling and reconstruction is in the branch double-shuffling. Switch to that branch and run make double_reconstruct to try it.