This is the workflow:
Scan parts of an image that considerably overlap.
They must all be approximately oriented correctly.
The program uses the overlapping areas for reconstruction
of the layout of the parts.
If all parts are in the directory part
then in the best case you can simply run:
patch-image --output=collage.jpeg part/*.jpeg
If you get blurred areas, you might enable an additional rotation correction:
patch-image --finetune-rotate --output=collage.jpeg part/*.jpeg
It follows an overview of how the program works. It implies some things you should care about when using the program.
The program runs three phases:
-
Orientate each image part individually
-
Find overlapping areas in the parts and reconstruct the part positions within the big image
-
Blend the parts in order to get the big image
The first phase orientates each part such that horizontal structures become perfectly aligned. Only the brightness channel of the image is analysed. Horizontal structures can be text or the border of the image. This also means that you should orientate the parts horizontally, not vertically. I also recommend not to mix horizontal and vertical scanned parts since the horizontal and vertical resolution of your scanner might differ slightly. However, it should be fine to rotate the image source by 180° and rotate it back digitally, before feeding it to the patch-image program.
Options for the first phase:
-
--maximum-absolute-angle
: Maximum angle to test for. -
--number-angles
: Number of angles minus one to test between negative and positive maximum angle. -
--hint-angle
: If the search for horizontal structures does not yield satisfying results for an image part, you may prepend the--hint-angle
option with the wanted angle to the image path.
In the second phase the program looks for overlapping parts between all pairs of images. For every pair it computes a convolution via a Fourier transform. Only the brightness channel of the image is analysed.
-
--pad-size
: Computing a convolution of two big images may exceed your graphics memory. To this end, images are shrunk before convolution. The pad size is the size in pixels after shrinking that holds 2x2 shrunken image parts. After determination of the distance between the shrunken parts the matching is repeated on a non-reduced clip of the original image part, in order to get precise coordinates. -
--minimum-overlap
: There must be a minimum of overlap in order to accept that the images actually overlap. The overlap is measured as a portion of the image part size. -
--maximum-difference
: The maximum allowed mean difference within an overlapping area of two overlapping images. If the difference is larger, then the program assumes that the parts do not overlap. -
--smooth
: It is important to eliminate a brightness offset, that is, big black and big white areas should be handled equally. To this end the image is smoothed and the smoothed image is subtracted from the original one. This option allows to specify the degree (radius) of the smoothing. I don't think you ever need to touch this parameter. -
--output-overlap
: Writes images for all pairs of image parts. These images allow you to diagnose where the matching algorithm failed. It may help you to adjust the matching parameters. In the future we might add an option to ignore problematic pairs.
Since in the first phase every image part is oriented individually, it may happen that the part orientations don't match. This would result in blurred areas in the final collage. In order to correct this, you can run phase two in an extended mode, that also re-evaluates the part orientations. The orientation of the composed image is then determined by the estimated orientation of the first image.
Options:
-
--finetune-rotate
: Enables the extended overlapping mode. The option--output-overlap
will then be ignored. -
--number-stamps
: The extended mode selects many small clips in the overlapping area and tries to match them. We call these clips /stamps/. This option controls the number of stamps per overlapping area minus one. -
--stamp-size
: Size of a square stamp in pixels.
The third phase composes a big image from the parts. These parts are weighted such that the part boundaries cannot be seen anymore and differences in brightness are faded into another. The downside is that the superposition may lead to blur.
Options:
-
--output
: Path of the output JPEG image with the weighted collage. -
--output-hard
: Alternative output of a JPEG collage where the image parts are simply averaged. You will certainly see bumps in brightness at the borders of the image parts. This output may be mostly useful to promote the great weighting algorithm employed by--output
. -
--output-distance-map
: The weight for every pixel is chosen according to the distance to an image part boundary that lies within other parts. The rationale is that the weight shall become zero when the pixel is close to a position that will be affected by a disruption otherwise. This option allows to emit the distance map for every image part. -
--distance-gamma
: If the distances are used for weighting as they are, the program fades evenly between the overlapping image parts over the entire overapping area. This may mean that the overlapping area is blurred. Raising the distance to a power greater than one reduces the area of blur. The downside is that it also reduces the area for adaption of differing brightness.
Our LLVM implementation provides an additional way to assemble the image parts. The already known weighting approach tries to blend across all the overlapping area. This can equalize differences in brightness. The downside is that imperfectly matching image parts lead to blurred content in the overlapping area. An alternative algorithm tries to make the overlapping as small as possible and additionally performs blending where it hurts least. More precisely, parts are blended where they differ least. However, if the brightness of the image parts differ then the blending boundaries may become visible.
Options:
-
--output-shaped
: Path of the output JPEG image with smoothly blended image parts along curves of low image difference. -
--output-shaped-hard
: Like before but the image parts are not smoothly faded. Instead, every pixel belongs to exactly one original image part. This is more for debugging purposes than of practical use. -
--output-shape
: Emit the smoothed mask of each image part used for blending. -
--output-shape-hard
: Emit the non-smoothed masks. -
--shape-smooth
: Smooth radius of the image masks. The higher, the smoother is the blending between parts.
General options:
--quality
: JPEG quality percentage for writing the images.
If the program does not work correctly on a particular set of images or if you want to archive what it has done, then the following options may help you.
-
--output-state=state-%s.csv
: Use of this option let the program generate three files that correspond to the stages of the image processing. For each file the program replaces the placeholder%s
by an identifier for the particular stage. For the example above the generated files are:-
state-angle.csv
: After the rotation detection the program emits this file containing the image paths and their angles in degrees. -
state-relation.csv
: This file contains for all possible pairs of image parts whether the program found that they overlap and, if yes, what their relative offset is. -
state-position.csv
: From the relative offsets the program computes the absolute part positions. This CSV file is essentially an extension ofstate-angle.csv
by the absolute positions. However, in--finetune-rotate
mode the angles may differ from the ones instate-angle.csv
.
-
-
--state=state-position.csv
: You can feed the program with various parameters such that the program does not need to compute them itself. This can be useful for repeating a previous computation, speedup a computation or improve a previous computation. The idea is to take eitherstate-angle.csv
orstate-position.csv
as generated by--output-state
. You may omit any column or re-order them, or you may clear individual cells. At least the Image column must be present. The most simple CSV file would thus be a text file where the first line isImage
and each of the following lines contains one image path. The images listed in the CSV file are loaded additionally to the files given at the command-line. A complete CSV file would contain the columnsImage
,Angle
,X
,Y
, and in--finetune-rotate
mode additionallyDAngle
.A missing column or empty cell means that the program computes the parameter itself. You may even fix say an X coordinate but let the program compute the corresponding Y coordinate. I do not know whether there is a useful application of this feature, but the feature is a natural result of how the computation works.
You may use the CSV file to set the rotation angle of the assembled image. To this end, choose an image part where you know the angle for sure and add the
Angle
to the image's row while omitting theAngle
for all other images.If an image is grossly turned away from horizontal alignment, that is, rotated by, say, more than 2 degree, or the image does not contain a horizontal structure to align to, then the automatic angle detection will fail. In this case you might use the Angle column to specify the angle directly.
In
--finetune-rotate
mode theAngle
column means the following: If anAngle
cell is set then the rotation angle is fixed both in the first rotation detection phase, and in the subsequent rotation correction. If anAngle
cell is empty then the rotation angle is both determined by horizontal structures in the first phase and then adjusted in the subsequent correction phase. Further, if you want to say that all angles are only hints that may be adjusted in the second phase, then you should add an emptyDAngle
column. If you want to exclude some angles from correction, then enter a0
at the corresponding cells in theDAngle
column.Summarized: If both
Angle
andDAngle
columns are present, then theAngle
column specifies the angle as determined in the first phase and theDAngle
column specifies the angle difference as determined in the second phase. You may enter values different from0
in theDAngle
column. E.g., setting1
means that any angle determined in the first phase is unconditionally increased by 1 degree in the second phase. I do not know, whether this can ever be useful, but providing the columnsAngle
andDAngle
this way seemed to be the most natural way to support all sensible use cases. If only theAngle
column is available this is equivalent to adding aDAngle
column and setDAngle=0
ifAngle
is set and leaveDAngle
empty ifAngle
is empty.If you list images both on the command-line and in a CSV file, then both lists are concatenated. I recommend that you use either command-line arguments or a CSV file to specify the image list. You can easily get this wrong when running the program with a command-line file list first and then rerunning the computation starting with a saved
state.csv
file. -
--relations=state-relation.csv
: The tablerelation.csv
contains pairs of images and their relative offsets. The most simple use case is to take thestate-relation.csv
file as generated by--output-state
and alter or delete rows that you find inappropriate. The common columns areImageA
,ImageB
, andRel
.Rel
can be:-
empty: The user wants the program to determine whether the image parts overlap.
-
X
: The user knows that the image parts overlap. -
-
: The user knows that the image parts do not overlap.
Without
--finetune-rotate
the columnsDX
andDY
follow. They tell the offset of the image parts to each other. In--finetune-rotate
mode there are the additional columnsXA
,YA
,XB
,YB
. They contain coordinates of corresponding points in the image pairs. There are usually many such correspondences for one image pair and they are listed below header lines for the image pairs.It is uncommon to have both a
state.csv
file with absolute positions and arelation.csv
file, though the program supports it. If there is a pair of image parts where both images have fixed positions instate.csv
and there is also a fixed offset registered inrelation.csv
, then the fixed positions win. The differing offset between the image parts are seen as an unfortunate impossibility to optimally match the optimization goals.There is another difficulty in mixing fixed positions together with fixed offsets: For a pair of image parts with fixed positions the program does not match images in order to compute the relative offsets of the parts because it would not affect the final positions, anyway. This optimization makes the program skip the phase of determining image relations completely, if all part positions are fixed. This is the wanted effect. The unwanted effect is that since no relative offsets are computed, no relative offsets can be written to
state-relation.csv
as requested by--output-state
. That is, re-running the program with--state
using a CSV file as previously emitted by--output-state
while using--output-state
anew will generate a newstate-relation.csv
file with less information than before or no information at all.Files listed in
relation.csv
are not loaded. Instead of this, the program checks consistency withstate.csv
and files listed as command-line arguments. -
If the program misses related image pairs
or recognizes unrelated images as overlapping,
you may first watch the overlap differences the program prints.
You may then adapt the threshold via the --maximum-difference
option.
The total image might contain repetitive elements
that make unrelated image parts look like they overlap.
So, if there is no clear threshold
or you find the above procedure inappropriate,
you may alternatively let the program emit a relation.csv
file
using the --output-state
option.
You can then edit relation.csv
file
and re-run the program with the option --relations=relation.csv
.