Out of heap space - parse one bucket at a time
codingchili opened this issue · 2 comments
codingchili commented
The current implementation parses all bulk insert buckets into one massive JSON object that is stored on the heap. The proposed solution prepares one bucket at a time, preferably while Elasticsearch is busy indexing the previous one.
To work around this issue, run with the -Xmx1g JVM parameter, or a larger value if required.
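A minimal sketch of the proposed approach, not the project's actual code: the caller prepares the next bucket while the previous one is being indexed, so at most two buckets are on the heap at any time. The class BucketPipeline, its methods, and the indexer callback are hypothetical names; in the real importer the callback would be whatever sends a bulk request to Elasticsearch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical sketch: one bucket is indexed while the next one is prepared. */
final class BucketPipeline {

    /** Reads up to bucketSize rows from the iterator into a new list. */
    static <T> List<T> nextBucket(Iterator<T> rows, int bucketSize) {
        if (bucketSize <= 0) {
            throw new IllegalArgumentException("bucketSize must be positive");
        }
        List<T> bucket = new ArrayList<>(bucketSize);
        while (bucket.size() < bucketSize && rows.hasNext()) {
            bucket.add(rows.next());
        }
        return bucket;
    }

    /**
     * Hands buckets to the indexer one at a time. While the indexer works on
     * one bucket in the background, the calling thread prepares the next.
     * Returns the number of buckets sent.
     */
    static <T> int importAll(Iterator<T> rows, int bucketSize, Consumer<List<T>> indexer) {
        int sent = 0;
        List<T> current = nextBucket(rows, bucketSize);
        while (!current.isEmpty()) {
            final List<T> toIndex = current;
            // Assumption: the indexer call blocks until the bucket is indexed.
            CompletableFuture<Void> inFlight =
                    CompletableFuture.runAsync(() -> indexer.accept(toIndex));
            // Prepare the next bucket while the previous one is being indexed.
            current = nextBucket(rows, bucketSize);
            // Wait for indexing to finish before sending another bucket.
            inFlight.join();
            sent++;
        }
        return sent;
    }
}
```

Calling BucketPipeline.importAll(rows, 255, bucket -> sendBulk(bucket)) would then replace building the full JSON object up front; sendBulk and the bucket size of 255 are placeholders. This only bounds the memory used for buckets; it does not reduce what the row source itself needs in order to read the file.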
codingchili commented
By parsing the Excel files up front, it is possible to fail before the import has started.
codingchili commented
Done. We still cannot support Excel files of arbitrary size, though: Apache POI consumes a lot of memory.