Out of heap space - parse one bucket at a time

Question

codingchili opened this issue 7 years ago · 2 comments

The current implementation parses all bulk insert buckets into one massive JSON object, which is stored on the heap. The proposed solution prepares one bucket at a time, preferably while Elasticsearch is busy indexing the previous one.
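A minimal sketch of the bucket-at-a-time idea, not the project's actual code. The class `BucketPipeline`, the method `importInBuckets`, and the `indexer` callback are hypothetical names; the callback stands in for whatever sends a bulk request to Elasticsearch. It assumes rows can be read lazily through an `Iterator`, so at most two buckets live on the heap at once: the one being indexed and the one being prepared.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical helper: prepares the next bucket while the previous one is indexed.
class BucketPipeline {

    /**
     * Reads rows lazily and hands them to the indexer one bucket at a time.
     * Returns the number of buckets submitted.
     */
    static int importInBuckets(Iterator<String> rows, int bucketSize,
                               Consumer<List<String>> indexer) {
        CompletableFuture<Void> inFlight = CompletableFuture.completedFuture(null);
        int buckets = 0;
        while (rows.hasNext()) {
            // Prepare the next bucket while the previous one is still indexing.
            List<String> bucket = new ArrayList<>(bucketSize);
            while (rows.hasNext() && bucket.size() < bucketSize) {
                bucket.add(rows.next());
            }
            // Wait for the previous bucket so only one is in flight at a time.
            inFlight.join();
            inFlight = CompletableFuture.runAsync(() -> indexer.accept(bucket));
            buckets++;
        }
        inFlight.join(); // wait for the final bucket to finish
        return buckets;
    }
}
```

Because each bucket becomes unreachable once its indexing call returns, the garbage collector can reclaim it before the import finishes, which is the point of the proposal. A failure in the indexer surfaces as a `CompletionException` from `join()`.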

To work around this issue, run the JVM with the -Xmx1g parameter, or a larger value if required.
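To confirm that the heap setting took effect, something like the following can help. This is a small illustrative check, not part of the project; `HeapCheck` is a hypothetical class name. It relies only on the standard `Runtime.maxMemory()` call, and the reported value may be slightly below the configured limit depending on the garbage collector.

```java
// Hypothetical helper: reports the heap limit the running JVM was started with.
class HeapCheck {

    /** Returns the maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```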

Answer 1 · 2017-10-07T12:36:31.000Z

By parsing the Excel files up front, it is possible to fail before the import has started.

Answer 2 · 2018-04-28T14:15:32.000Z

Done. We still cannot support Excel files of arbitrary size, because Apache POI consumes a LOT of memory.