bethgelab/decompose

Model fitting fails on large datasets

Closed this issue · 5 comments

vilim commented

Thank you for developing this package and publishing it in this easily-installable and open way!
I was trying out the method on our lightsheet data, and the results on small patches look promising. However, if I try to apply it to a bigger chunk, TensorFlow complains:
ValueError: Cannot create a tensor proto whose content is larger than 2GB.
I already have algorithms in place to stitch spatial filters extracted from overlapping patches with another method (CNMF). However, due to the large signal contamination from out-of-focus planes in the lightsheet, taking bigger areas with more planes into account would yield better results, provided the computational costs do not become prohibitive.

Thank you for trying out our framework.

Currently a single filter bank cannot be larger than 2GB. Could that be a problem in your case? Could you please let me know the shape of the input data (X) and the number of sources (K)?

vilim commented

This is quite likely the case: the input data is 1275 frames with 1,266,720 voxels, and I have set the number of sources to 1000, which I guess was altogether too ambitious to try at once. Memory-wise the computer should be able to handle it (it has 128 GB of RAM), so I guess this is a TensorFlow-related limit?
Does the run-time of the algorithm scale linearly with the dimensions of the tensors?
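For scale, with the dimensions above a single spatial filter bank already exceeds the 2 GB protobuf limit on its own, assuming 32-bit floats (the dtype is an assumption; the element counts are the numbers reported above):

```python
voxels, sources = 1_266_720, 1_000   # dimensions reported above
bytes_per_float32 = 4                # assumed dtype
filter_bank_gb = voxels * sources * bytes_per_float32 / 2**30
print(f"{filter_bank_gb:.1f} GB")    # ~4.7 GB, well over the 2 GB proto limit
```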

@aboettcher: apparently this is a well-known limit of TensorFlow that can be circumvented by first creating a placeholder and then feeding the array at initialization time (see https://stackoverflow.com/questions/35394103/initializing-tensorflow-variable-with-an-array-larger-than-2gb). Would that solution work for us as well?
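The pattern from the linked answer looks roughly like the following minimal sketch (TensorFlow 1.x API; the shape and variable names are illustrative, not taken from the decompose code):

```python
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Large initial values computed outside the graph (shape is illustrative).
init_values = np.random.randn(1_266_720, 1_000).astype(np.float32)

# Feeding the array through a placeholder at initialization time keeps it
# out of the serialized graph, where constants are capped at 2 GB.
init_ph = tf.placeholder(tf.float32, shape=init_values.shape)
filter_bank = tf.Variable(init_ph)

with tf.Session() as sess:
    sess.run(filter_bank.initializer, feed_dict={init_ph: init_values})
```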

Given the dimensions provided by @vilim, the problem should be the size of the filter banks. They are currently initialized via tensorflow.constant from a (random) numpy array, which unnecessarily embeds large constants in the graph. The approach pointed out by @wielandbrendel should avoid that problem. I will update the code accordingly.

Commits 0ea7801 and 5ad43fd should fix that issue.