Add API for concatenating images
mheppner opened this issue · 6 comments
With the current thunder.images.fromtif() call, either a single image or a glob of images in a directory can be loaded. To load specific images, you have to do something like this:
rdds = [
    thunder.images.fromtif('path1'),
    thunder.images.fromtif('path2'),
]
bigRdd = sc.union(rdds)
data = thunder.images.fromrdd(bigRdd)
In addition to the other methods mentioned in #331, an API could be added to concatenate image objects, like this:
data1 = thunder.images.fromtif('path1')
data2 = thunder.images.fromtif('path2')
data = data1.concatenate(data2)
@mheppner I think this use case is handled by thunder.images.fromlist(), which takes a list of files and a function for loading each file.
So in your example, you would do something like this:
# list of paths to images
im_paths = ['path_to_image_1', 'path_to_image_2']
# a function that takes a path and returns image data
def tif_loader(path):
    from skimage.io import imread
    return imread(path)
data = thunder.images.fromlist(im_paths, accessor=tif_loader)
Does this work for you? (Personally I like this a lot more than the thunder.images.fromtif() approach...)
That could work too, but it takes away from some of the magic of using .fromtif(). I could supply my own accessor, but I would really just be copying the one already in .fromtif(), which feels a bit odd to me. I guess this would ultimately come down to changing .frompath(). Regardless, I can either give it a single file, an entire directory of files, or a glob pattern, but there's no way to load just a specific set of files.
The use case I have is to search a database to get paths of tifs to load into a thunder set. The only way of doing this is either to copy all the files into a temporary directory, or the method I mentioned above of joining all the RDDs. #331 is going to be more useful than this issue, but I figured I'd add it anyways. I think it still might be useful to concatenate images together though.
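To make the lookup step concrete, here is a minimal sketch of it. Everything in it is hypothetical and only for illustration: the SQLite schema (an images table with path and experiment columns), the find_tif_paths helper, and the file paths.

```python
import sqlite3


def find_tif_paths(conn, experiment):
    """Return the file paths recorded for one experiment, ordered by path."""
    rows = conn.execute(
        "SELECT path FROM images WHERE experiment = ? ORDER BY path",
        (experiment,),
    )
    return [row[0] for row in rows]


# Build a throwaway in-memory database so the sketch runs on its own.
# The table layout and the paths below are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (path TEXT, experiment TEXT)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?)",
    [
        ("/data/a/001.tif", "exp1"),
        ("/data/b/002.tif", "exp2"),
        ("/data/c/003.tif", "exp1"),
    ],
)

# Files scattered across directories that no single glob pattern selects.
im_paths = find_tif_paths(conn, "exp1")
```

The resulting im_paths list is exactly the kind of input that a single path, a directory, or a glob pattern cannot express.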
What magic is there in using .fromtif()? Maybe I'm coming from a different perspective because I work with a variety of image formats, but the .fromlist() constructor seems about as simple and direct as you can get: feed it a list of images, populated however you like (e.g., by searching a database) and then specify how to load the files with your own accessor. You could copy the accessor in .fromtif(), but you could more simply use any other function for loading .tif files. Doesn't this satisfy your use case?
Yes, that does fit the use case, but why copy something that already exists? Why can't .fromtif() simply accept a list as well as a string?
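As a rough illustration of what accepting both could involve, here is a minimal sketch of the argument handling only. resolve_paths and its ext parameter are hypothetical names, not part of thunder, and this is not how .fromtif() or .frompath() is actually implemented. It just shows a string (file, directory, or glob) and a list being normalized to the same kind of file list.

```python
import glob
import os


def resolve_paths(path, ext="tif"):
    """Turn a str (file, directory, or glob) or a list of paths into a list of files.

    Hypothetical helper for illustration; not thunder's implementation.
    """
    if isinstance(path, (list, tuple)):
        # An explicit collection of files: keep the caller's order.
        return [os.fspath(p) for p in path]
    if os.path.isdir(path):
        # A directory: match every file with the given extension.
        return sorted(glob.glob(os.path.join(path, "*." + ext)))
    # A single file or a glob pattern: let glob expand it.
    return sorted(glob.glob(path))
```

With something like this in front of the existing loader, a call such as resolve_paths(['path1', 'path2']) and a call such as resolve_paths('some_directory') would both produce a plain list of files for the same loading code to consume.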