abria/TeraStitcher

Clarification on two-level folder hierarchy

Closed this issue · 8 comments

Hello.
I am a newbie in microscopy imaging and I want to use your wonderful, free software to stitch some large images. Thank you for making it public.

I have an 8x7 mosaic of tiled data that I want to stitch. Each tile is a 3D TIFF of size 2160x2560x695. As I understand from the wiki, I have to use TiledXY|3Dseries as the volin plugin. My script to do the stitching is the following:


TERA=TeraStitcher-portable-1.11.6-Linux/terastitcher

$TERA --import --volin=prep/ --volin_plugin="TiledXY|3Dseries" --volout=prep/ --projout=prep/myxml1.xml --ref1=X --ref2=-Y --ref3=Z --vxl1=1.2 --vxl2=1.2 --vxl3=5 --imin_plugin=tiff3D

export USECUDA_X_NCC=1
$TERA --displcompute --projin=prep/myxml1.xml --projout=prep/myxml2.xml --oV=512 --oH=432 --sV=25 --sH=25 --algorithm=MIPNCC --imin_channel=all --subvoldim=

$TERA --displproj --projin=prep/myxml2.xml --projout=prep/myxml3.xml

$TERA --displthres --projin=prep/myxml3.xml --projout=prep/myxml4.xml --threshold=0.1

$TERA --placetiles --projin=prep/myxml4.xml --projout=prep/myxml5.xml

$TERA --merge --projin=prep/myxml5.xml --volout=prep/ --volout_plugin="TiledXY|2Dseries" --resolutions=0 --slicewidth=-1 --sliceheight=-1 --slicedepth=-1 --imin_channel=all --imout_plugin=tiff2D --imout_format=tif --imout_depth=16 --stitchablesonly --oV=512 --oH=432 --sV=0 --sH=0 --algorithm=SINBLEND

My 3D image tiles are acquired in the following manner (from 0 to 55, 8x7 mosaic)
0 1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 ....
..
..
49 50 51 52 53 54 55

My input folder is called "prep/" and it has the following structure,

[roys5@cn0625 prep]$ ls prep/
000000/ 000001/ 000002/ 000003/ 000004/ 000005/ 000006/ 000007/
[roys5@cn0625 DS4]$ ls prep/000000
000000_000000/ 000000_000001/ 000000_000002/ 000000_000003/ 000000_000004/ 000000_000005/ 000000_000006/
[roys5@cn0625 DS4]$ ls prep/000003
000003_000000/ 000003_000001/ 000003_000002/ 000003_000003/ 000003_000004/ 000003_000005/ 000003_000006/

As you can see, each folder (00000R) has its subfolders called 00000R_00000C (R=0,..,7,C=0,..,6, according to my 8x7 row x col mosaic).

I have put a 3D tiff (of depth 695) within each of these subfolders as shown below,
[roys5@cn0625 DS4]$ ls prep/000003/000003_000005/
im.tif

When I run the command, I get the following error:
ERROR: in VirtualVolume::extractCoordinates(...): unable to extract Z position from filename im.tif

Clearly I don't understand the concept of two-level folder structure. I thought the folder names FFFFFF_SSSSSS correspond to index (i.e. row_i, col_j). Could you please clarify how to properly set up the images?

Also, the overlap is 20% as mentioned in the tiff info (see attached). So if the overlap is 20%, should --oV and --oH be 20% of the height and width, i.e. 512 and 432? Also, what are the ideal --sV and --sH parameters for this case?
Thank you for your help.

info.txt

There is some misunderstanding in your use of the command line options of TeraStitcher as well as in the use of the two-level hierarchy of directories to perform the import step.
It is not easy to explain the right way to use these options (some information is available in the wiki, but apparently it is not enough).
I think I have enough information to build the xml import file you need, and when I send it to you I will try to explain more details. However, I need a few days to do that because I am very busy at the moment.
In the meantime, I suggest you look at the TeraTools Guide, which you can find in the right panel of the main wiki page. Although it is quite technical, it includes some additional information that may be useful.

Thank you, I appreciate it. Please take your time.
If you could correct my command line options, that would be wonderful. I want to process a lot of data with different resolutions and dimensions, so running a batch job with command line arguments will be most helpful. Thank you.

Here are my comments, which could help you improve your pipeline.

To fully understand what I explain below, keep in mind what is reported in the TeraTools documentation, sections 1.2 and 1.3. Please have a look at them before reading on.

Comment about the --import step:

In each second-level folder, the multi-page TIFF can simply be named:

im_0.tif

because TeraStitcher accepts only file names that encode the Z position of the image.

Another error I note in your command is the values you give to the options --ref1 and --ref2. If I have interpreted your arrangement of folders correctly, the correct options should be --ref1=Y and --ref2=X. Indeed, the first level of folders encodes the V position (corresponding to Y) and the second level of folders encodes the H position (corresponding to X). The minus sign in --ref2 means that the left-most tiles in the tile matrix are those stored in the folders with the highest name (interpreted as a position), which does not seem to be your case.
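As an illustration of the two-level layout described above, here is a minimal Python sketch that generates folder names of the form VVVVVV/VVVVVV_HHHHHH, where the first level encodes the V (Y) position and the second level the H (X) position. The six-digit zero padding follows the folder names shown in the question. The function name `tile_folders` and the parameter `units_per_micron` are hypothetical, and the unit in which positions are written is an assumption here: the value 10 (tenths of a micron) used in the example should be checked against sections 1.2 and 1.3 of the TeraTools documentation before relying on it.

```python
def tile_folders(n_rows, n_cols, step_v_um, step_h_um, units_per_micron):
    """Return relative folder paths 'VVVVVV/VVVVVV_HHHHHH' for every tile.

    step_v_um / step_h_um: displacement between adjacent tiles, in microns.
    units_per_micron: ASSUMPTION, the scale used when writing positions
    into folder names (check the TeraTools documentation).
    """
    paths = []
    for r in range(n_rows):
        # First level: position along V (Y) of this row of tiles.
        v = round(r * step_v_um * units_per_micron)
        for c in range(n_cols):
            # Second level: position along H (X) of this column.
            h = round(c * step_h_um * units_per_micron)
            paths.append(f"{v:06d}/{v:06d}_{h:06d}")
    return paths


# Mosaic from the question: 8 rows x 7 columns, tiles 2160 (W) x 2560 (H) voxels,
# 20% overlap, 1.2 micron voxels, so the steps are 2560*0.8*1.2 and 2160*0.8*1.2.
paths = tile_folders(8, 7, step_v_um=2457.6, step_h_um=2073.6, units_per_micron=10)
print(paths[0])  # 000000/000000_000000
print(paths[1])  # 000000/000000_020736
print(paths[7])  # 024576/024576_000000
```

Each of these folders would then hold one multi-page TIFF whose name encodes its Z position, such as im_0.tif.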

Comment about the --displcompute step:

An overlap of 20% means that if the image size in one dimension is 2160 voxels, the tile overlap is 432 voxels and each tile is displaced by 1728 voxels with respect to the preceding one in the tile matrix. This information is computed at the import step if the conventions explained above are used, and it is used as a starting point in the align step. Options --oV and --oH should be used if for some reason the overlap computed in the import step is wrong. Actually, in your case, if you want to use row and column indices to encode the folder names, you could use these options to specify the correct starting position to be used in the align step. Note, however, that you have to change the name of the TIFF files by adding an underscore and some digits (the missing digits are the reason why you get the error).
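The overlap arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation, not TeraStitcher code; the function name `overlap_and_step` is made up for the example.

```python
def overlap_and_step(tile_size_vox, overlap_fraction, voxel_size_um):
    """Return (overlap in voxels, tile-to-tile step in voxels, step in microns)."""
    overlap_vox = round(tile_size_vox * overlap_fraction)
    step_vox = tile_size_vox - overlap_vox
    return overlap_vox, step_vox, step_vox * voxel_size_um


# Tile from the question: 2160 x 2560 voxels, 20% overlap, 1.2 micron voxels.
overlap_h, step_h, step_h_um = overlap_and_step(2160, 0.20, 1.2)
overlap_v, step_v, step_v_um = overlap_and_step(2560, 0.20, 1.2)
print(overlap_h, step_h)  # 432 1728
print(overlap_v, step_v)  # 512 2048
```

The step in microns (the third returned value) is the displacement between adjacent tiles.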

Options --sV, --sH, --sD and --subvoldim should be used to help the align algorithm find the right alignment. The search area depends on mechanical errors and on voxel size. Assume that we can establish an upper bound of 20 microns for mechanical errors along V (a similar procedure can be applied to the other dimensions): if the voxel size is 1.2 microns, --sV should be set to at least 20/1.2 = 17 voxels (16.7 rounded up). In other words, --sV must increase as mechanical errors increase and as voxel size decreases.

As to --subvoldim, it has a default of 200 voxels, which means that in your case the alignments between each pair of tiles are computed for 4 substacks of size 174, 174, 174 and 173 voxels along Z. Increasing this parameter can reduce computation times, but if there are densely labelled structures in the image it may decrease the reliability of the alignments. Conversely, reducing this parameter increases computation times and can improve alignment reliability, but reducing it too much can prevent a correct alignment along Z, especially if misalignments along Z can be significant. Hence a trade-off must be found. Typically, a value between 100 and 200 is a good choice.
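The two numbers quoted above (17 voxels, and substacks of 174, 174, 174 and 173) can be reproduced with the sketch below. The function names are hypothetical. The substack split is an assumption inferred from the quoted sizes (the depth is cut into the smallest number of pieces no larger than --subvoldim, as evenly as possible); it is not taken from the TeraStitcher source code.

```python
import math


def search_radius_vox(max_error_um, voxel_size_um):
    """Smallest search radius (in voxels) that covers the given mechanical error."""
    return math.ceil(max_error_um / voxel_size_um)


def substack_sizes(depth_vox, subvoldim):
    """ASSUMED splitting rule: fewest substacks of at most subvoldim voxels,
    with sizes differing by at most one voxel."""
    n = math.ceil(depth_vox / subvoldim)
    base, extra = divmod(depth_vox, n)
    return [base + 1] * extra + [base] * (n - extra)


print(search_radius_vox(20, 1.2))  # 17
print(substack_sizes(695, 200))    # [174, 174, 174, 173]
```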

Comment about the --displthres step:

A threshold of 0.1 is too low; a value of 0.6-0.7 is more reasonable. A lower value can occasionally be used if overlapping regions have low contrast.

Comment about the --merge step:

Using --oV, --oH, --sV, --sH does not make sense in this step.

Finally, I noted that you plan to use the CUDA code for the align step. Please cite the Frontiers in Neuroinformatics paper in your publications.
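For batch processing, the corrections in the comments above could be collected as in the following Python sketch. It only builds the argument lists and runs nothing. Every flag is copied from the script in the question, with these changes: --ref1=Y and --ref2=X in the import step, a higher --threshold, and no --oV, --oH, --sV, --sH in the merge step. It assumes the folder names encode tile positions, so that the import step computes the overlap and --oV/--oH can be left out of --displcompute. The function name `build_pipeline` and its parameters are hypothetical, and the result has not been tested against TeraStitcher.

```python
def build_pipeline(tera, data_dir, search_radius, subvoldim, threshold=0.7):
    """Return the six TeraStitcher invocations as argument lists (nothing is run)."""

    def xml(step):
        # Project files follow the myxml1.xml ... myxml5.xml naming of the question.
        return f"{data_dir}/myxml{step}.xml"

    return [
        [tera, "--import", f"--volin={data_dir}/",
         "--volin_plugin=TiledXY|3Dseries", f"--volout={data_dir}/",
         f"--projout={xml(1)}",
         "--ref1=Y", "--ref2=X", "--ref3=Z",  # first level = V (Y), second = H (X)
         "--vxl1=1.2", "--vxl2=1.2", "--vxl3=5", "--imin_plugin=tiff3D"],
        [tera, "--displcompute", f"--projin={xml(1)}", f"--projout={xml(2)}",
         f"--sV={search_radius}", f"--sH={search_radius}",
         f"--subvoldim={subvoldim}",
         "--algorithm=MIPNCC", "--imin_channel=all"],
        [tera, "--displproj", f"--projin={xml(2)}", f"--projout={xml(3)}"],
        [tera, "--displthres", f"--projin={xml(3)}", f"--projout={xml(4)}",
         f"--threshold={threshold}"],
        [tera, "--placetiles", f"--projin={xml(4)}", f"--projout={xml(5)}"],
        # No overlap or search options here: they do not apply to the merge step.
        [tera, "--merge", f"--projin={xml(5)}", f"--volout={data_dir}/",
         "--volout_plugin=TiledXY|2Dseries", "--resolutions=0",
         "--slicewidth=-1", "--sliceheight=-1", "--slicedepth=-1",
         "--imin_channel=all", "--imout_plugin=tiff2D", "--imout_format=tif",
         "--imout_depth=16", "--stitchablesonly", "--algorithm=SINBLEND"],
    ]


# 17 voxels and 200 voxels are the example values discussed above.
commands = build_pipeline("terastitcher", "prep", search_radius=17, subvoldim=200)
for argv in commands:
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)`. Because no shell is involved, the `|` in the plugin name needs no quoting. To enable the CUDA code, USECUDA_X_NCC=1 would have to be set in the environment of the --displcompute call.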

Hi Giulio,
Thanks so much for the clarification. I think I have understood the two level hierarchy concept.
I have created this xml file for a dataset of 3x9 mosaic (RxC), each tile having size 1211x2560x1018 (WxHxD), resolution 1208.179x1208.179x5000 nm, overlap ratio 10%. So I calculated the displacements as 1211x1.208179x0.90=1316um and 2560x1.208179x0.90=2783um. Then I created folders with vertical and horizontal displacement names and put a 3D image inside each folder.
Then the xml file was created using this command:
terastitcher --import --volin=test/ --volin_plugin="TiledXY|3Dseries" --volout=test/ --projout=test/myxml1.xml --ref1=Y --ref2=X --ref3=Z --vxl1=1.2 --vxl2=1.2 --vxl3=5 --imin_plugin=tiff3D
Could you please look into the attached xml file and let me know if this is correct? Thank you for your patience and I appreciate your help.

myxml1.xml.pdf

The xml seems correct.
Actually, when TeraStitcher generates an xml file, it adds at the beginning of the file the following standard lines:

But TeraStitcher should work even if these lines are missing.

Thank you. I am able to successfully run it. The previous xml file WAS created by Terastitcher though, I did not create it manually.

Dear Giulio,
Sorry to bother you again. Just to clarify, is there a way to use multiple GPUs? I didn't find any documentation on using multiple GPUs, although I am able to run it on a single GPU. Also, while running, it uses a very small amount (~300MB) of GPU memory, although my GPU has 12GB of memory. Is that the correct behaviour? Thank you for your time.

The behavior is correct. Our tool is designed to minimize memory occupancy.

To use multiple GPUs, first download the latest versions of the scripts from the "Multi-CPU with MPI" page of the wiki (https://github.com/abria/TeraStitcher/wiki/Multi-CPU-parallelization-using-MPI-and-Python-scripts). These scripts also work with Python 3.

Second, you should slightly change lines 286-289 as follows (the first two lines are new):

# Assign a GPU to this worker in round-robin order
gpu = (myrank-1) % number_of_available_gpus
# str() is needed: gpu is an integer and cannot be concatenated to a string directly
gpu_commands = "export USECUDA_X_NCC=1; export CUDA_VISIBLE_DEVICES=" + str(gpu) + "; "
if debug_level > 0 :
    execution_string = gpu_commands + prefix + list(input_file.values())[0] + " > " + "output_" + str(list(input_file.keys())[0]) + ".out"
else :
    execution_string = gpu_commands + prefix + list(input_file.values())[0]

where number_of_available_gpus is the number of GPUs you have.
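To see how this assignment distributes the work, here is a small standalone sketch of the same mapping. The function name `gpu_prefix` is made up for the example. Reading `myrank-1` as meaning that rank 0 only coordinates and launches no TeraStitcher process is an assumption based on the expression itself, not on the script.

```python
def gpu_prefix(myrank, number_of_available_gpus):
    """Shell prefix that enables the CUDA code and pins one GPU per worker rank."""
    # Round-robin assignment: ranks 1, 2, 3, ... map to GPUs 0, 1, 2, ... and wrap around.
    gpu = (myrank - 1) % number_of_available_gpus
    return f"export USECUDA_X_NCC=1; export CUDA_VISIBLE_DEVICES={gpu}; "


# Five workers (ranks 1..5) sharing two GPUs alternate between GPU 0 and GPU 1.
for rank in range(1, 6):
    print(rank, gpu_prefix(rank, 2))
```

When there are more worker ranks than GPUs, several processes share a GPU. Given the low memory use per process reported above (about 300MB), that sharing may be acceptable.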