Zarr sink multiprocessing issue
manthey commented
The zarr sink doesn't work correctly with multiprocessing. Specifically, try this:
In python repl 1:
import large_image
import numpy as np
ts = large_image.new()
ts.addTile(np.zeros((1, 1, 1)), x=4095, y=4095, s=3, z=4) # our image is now 5,4096,4096,4
print(ts.largeImagePath) # note this name
In python repl 2:
import large_image
import numpy as np
ts = large_image.open(<name from above>)
ts.addTile(np.zeros((1, 1, 1)), x=2047, y=2047, s=3, z=2)
In python repl 1:
print(ts.metadata) # z is now only 3 long, not 5
Specifically, when we get the 0 level, we need to ensure that we honor the existing array and not resize it down.
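A minimal sketch of the "do not resize it down" idea, assuming the fix amounts to taking the per-axis maximum of the shape already on disk and the shape the incoming tile requires. The helper name `grown_shape` and its arguments are hypothetical and are not part of large_image or zarr. The (z, y, x, s) axis order is taken from the comment in the first REPL snippet, and the real sink's bookkeeping may differ.

```python
import numpy as np


def grown_shape(existing_shape, tile_shape, tile_origin):
    """Return a shape that fits both the existing array and the new tile.

    Hypothetical helper for illustration only: each axis takes the larger of
    the current extent and the extent the tile needs (origin + size), so an
    array that another process already enlarged is never shrunk.
    """
    existing = np.asarray(existing_shape, dtype=np.int64)
    needed = (np.asarray(tile_origin, dtype=np.int64)
              + np.asarray(tile_shape, dtype=np.int64))
    return tuple(int(v) for v in np.maximum(existing, needed))


# The array written by the first process, in assumed (z, y, x, s) order.
existing = (5, 4096, 4096, 4)
# The second process adds a 1x1 tile at z=2, y=2047, x=2047, s=3.
new_shape = grown_shape(existing, tile_shape=(1, 1, 1, 1),
                        tile_origin=(2, 2047, 2047, 3))
print(new_shape)
```

With this rule the second process's smaller tile would leave the z axis at 5 rather than cutting it to 3, since every axis can only grow.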
manthey commented
This test shows the error:
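import subprocess
import sys

import large_image_source_zarr
import numpy as np


def testMultiprocessZarrSink(tmp_path):
    # Create a sink in this process and grow it to 5 z-planes of 4096 x 4096.
    ts = large_image_source_zarr.new()
    ts.addTile(np.zeros((1, 1, 1)), x=4095, y=4095, z=4)
    path = ts.largeImagePath
    # Open the same sink in a separate interpreter and add a smaller tile.
    subprocess.check_call([sys.executable, '-c', """import large_image_source_zarr
import numpy as np
ts = large_image_source_zarr.open('%s')
ts.addTile(np.zeros((1, 1, 1)), x=2047, y=2047, z=2)
""" % path])
    # The original process should still see the full extent.
    assert ts.metadata['IndexRange']['IndexZ'] == 5
    assert ts.sizeX == 4096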
Curiously, moving the subprocess code into a function and running it with multiprocessing.Process doesn't show the error.