Improve error message when sparkey hits array-size limits
kellen opened this issue · 0 comments
kellen commented
When using saveAsSparkey, if any shard is larger than ~2 GB you will get a coder exception and something like:
Error message from worker: org.apache.beam.sdk.util.UserCodeException: java.lang.OutOfMemoryError: Required array length 2147483639 + 15534 is too large
which is not easily interpretable.
See if we can preemptively capture serialized sizes so that we can fail early with a better error message, like "Increase number of sparkey shards".
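
A minimal sketch of what such a check could look like, in plain Java so it stands alone. ShardSizeGuard, its constructor argument, and its add method are hypothetical names for illustration, not existing Scio or Sparkey API. It assumes the failure happens when a shard's serialized bytes end up in a single JVM byte array. The 2147483639 in the message above equals Integer.MAX_VALUE - 8, the largest array length the JDK will try to grow to.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Scio or Sparkey. */
final class ShardSizeGuard {
    // Integer.MAX_VALUE - 8 = 2147483639, the array length reported in the
    // OutOfMemoryError message.
    static final long SOFT_MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    private final long limitBytes;
    // Running total of serialized bytes seen so far for each shard.
    private final Map<Integer, Long> bytesPerShard = new HashMap<>();

    ShardSizeGuard(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /**
     * Records the serialized size of one entry for the given shard and
     * returns the shard's new total. Throws with an actionable message as
     * soon as the total exceeds the configured limit.
     */
    long add(int shard, long serializedBytes) {
        long total = bytesPerShard.merge(shard, serializedBytes, Long::sum);
        if (total > limitBytes) {
            throw new IllegalStateException(String.format(
                "Sparkey shard %d would hold %d serialized bytes, above the "
                    + "%d-byte limit. Increase number of sparkey shards.",
                shard, total, limitBytes));
        }
        return total;
    }
}
```

In a real pipeline the guard would presumably be created with ShardSizeGuard.SOFT_MAX_ARRAY_LENGTH as the limit and fed the encoded size of each key/value pair before it is written. Where exactly that hook belongs in saveAsSparkey is left open here.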