Split full report if one file would have too many rows
Reject a full report request if it would contain more than 50000 annotations for a single volume. Even though the full report Python script is now more memory efficient, such a report can generate extremely large temporary files (>1.5 GB for a volume with 80000 annotations) and, if freehand polygons are used, can contain more rows than Excel or Calc can handle.
Even better would be a dynamic limit. Find out what the maximum number of rows is that Excel/Calc can handle. Then split the report into multiple files if the number of rows in a single file would be too large.
According to Microsoft Support, the Excel worksheet limit is 1,048,576 rows by 16,384 columns.
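A minimal sketch of the dynamic split, assuming the report rows are available as an iterable and every output file or worksheet repeats one header row. The names `split_rows` and `EXCEL_MAX_ROWS` are hypothetical and not part of the existing report script:

```python
from itertools import islice

# Worksheet row limit as stated by Microsoft Support (see above).
EXCEL_MAX_ROWS = 1_048_576


def split_rows(rows, header_rows=1, max_rows=EXCEL_MAX_ROWS):
    """Yield lists of data rows so that each chunk plus its header fits max_rows.

    `rows` can be any iterable, so the full report never has to be in memory.
    """
    chunk_size = max_rows - header_rows
    if chunk_size <= 0:
        raise ValueError("max_rows must be larger than header_rows")
    iterator = iter(rows)
    while True:
        # Take at most chunk_size rows from the shared iterator.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Small demonstration with an artificial limit of 5 rows per sheet.
chunks = list(split_rows(range(10), header_rows=1, max_rows=5))
print([len(c) for c in chunks])
```

Each chunk could then be written to its own file or worksheet, whichever turns out to be easier to support.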
Your reference suggests that we could split the report into multiple worksheets instead of multiple files. This would be much easier. However, I think we do the worksheet split for some other cases, too (split by label tree?). I can't recall exactly.
There is no easy fix for this. It could be solved in three different ways, but none of them is straightforward:
- Split the XLSX into multiple files: There is no concept of a single report that consists of multiple files, so this would require significant work.
- Use multiple worksheets: Worksheets are already used if the report should be split by label tree or user, so an additional split by row count would have to be combined with that.
- Deny request if report would be too big: "Too big" depends on the number of annotations and on the number of annotation coordinates. A report could contain 1M point annotations or only a few thousand freehand polygon annotations, so a hard limit for the number of annotations does not really make sense. Validation of a request that checks the number of annotation coordinates would be quite slow and/or complex, I think (count the commas in the points column?).
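The comma-counting idea from the last option could look roughly like the sketch below. It assumes the points column holds a flat, bracketed, comma-separated array such as `[10,20,30,40]`; that format and both function names are assumptions made for illustration, not taken from the actual schema:

```python
def count_coordinate_values(points: str) -> int:
    """Count the values in a flat points string like "[10,20,30,40]".

    Assumes no nested arrays and no commas other than value separators.
    """
    inner = points.strip().strip("[]").strip()
    if not inner:
        return 0
    # n values are separated by n - 1 commas.
    return inner.count(",") + 1


def total_coordinate_values(points_column) -> int:
    """Sum the coordinate values over all annotations of a report."""
    return sum(count_coordinate_values(p) for p in points_column)


# One point annotation and one triangle-like polygon.
print(total_coordinate_values(["[10,20]", "[1,2,3,4,5,6]"]))
```

This still has to touch every annotation, so it does not remove the performance concern. It only shows that the check itself is simple.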
Another idea: Change the report to contain the array of coordinates in a single cell (like the CSV report). Offer a checkbox that makes the old behavior opt-in for backwards compatibility. Communicate this to the users.
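A rough sketch of what the two modes could look like. It uses Python's `csv` module instead of XLSX to stay self-contained, and it assumes the old behavior writes one row per coordinate pair, which is a guess based on the row counts discussed above. `write_rows` and `expand_coordinates` are hypothetical names:

```python
import csv
import io
import json


def write_rows(annotations, out, expand_coordinates=False):
    """Write (annotation_id, points) tuples to `out` as CSV rows.

    expand_coordinates=False: one row per annotation, points as a JSON array
    in a single cell.
    expand_coordinates=True: assumed old behavior, one row per x/y pair.
    """
    writer = csv.writer(out)
    for annotation_id, points in annotations:
        if expand_coordinates:
            # Pair up the flat list [x1, y1, x2, y2, ...].
            for x, y in zip(points[0::2], points[1::2]):
                writer.writerow([annotation_id, x, y])
        else:
            writer.writerow([annotation_id, json.dumps(points)])


buffer = io.StringIO()
write_rows([(7, [1, 2, 3, 4])], buffer)
print(buffer.getvalue())
```

With the single-cell variant, the number of rows equals the number of annotations, so the row limit no longer depends on the annotation shape.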