danielberkompas/elasticsearch-elixir

Consistent memory usage when bulk indexing?

strzibny opened this issue · 1 comment

Hello all,

I am using elasticsearch-elixir to bulk index a dataset from a CSV file. Everything works, except that memory usage keeps growing until the BEAM crashes due to insufficient free memory. If I use a similar stream in plain Elixir, without elasticsearch-elixir, I am able to go through the whole collection (and count something, for example) without issue. I would think that the old CSV data (already indexed) would not need to be kept in memory, but for some reason that is what is happening.
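
For reference, this is the kind of plain lazy stream I mean (illustrative only; the file path and parsing are simplified), and it runs in roughly constant memory for me:

```elixir
# Hypothetical example: counting rows from a large CSV with a lazy stream.
# File.stream!/1 reads lines on demand, so already-processed lines are not retained.
"priv/data/records.csv"
|> File.stream!()
|> Stream.map(&String.trim/1)
|> Enum.count()
```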

Is there a way to bulk insert half of the data and then continue (e.g. without creating a new version of the index)? Or perhaps this is a bug in elasticsearch-elixir?

I don't have any insight here. The streaming behavior is provided by the Elasticsearch.Store module that you supply to the library. Perhaps something in that implementation is causing the memory not to be released.
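
To illustrate what I mean, here is a rough sketch of a file-backed store. The module name, file path, and row format are made up, and the `stream/1` and `transaction/1` callbacks assume a recent version of the library's `Elasticsearch.Store` behaviour. The important point is that the pipeline stays lazy end to end, so already-indexed rows can be garbage collected:

```elixir
defmodule MyApp.CSVStore do
  @behaviour Elasticsearch.Store

  @impl true
  def stream(_schema) do
    # Lazy pipeline: rows are read and parsed only as the bulk uploader
    # consumes them, rather than being loaded into memory up front.
    "priv/data/records.csv"
    |> File.stream!()
    |> Stream.map(&parse_row/1)
  end

  @impl true
  def transaction(fun) do
    # No database transaction is needed for a file-backed store;
    # just run the function and return its result.
    fun.()
  end

  # Hypothetical row format: "id,name"
  defp parse_row(line) do
    [id, name] = line |> String.trim() |> String.split(",")
    %{id: id, name: name}
  end
end
```

If your store builds the whole list eagerly (for example with `Enum.map/2` instead of `Stream.map/2`, or by reading the entire file first), the full dataset will be held in memory for the duration of the bulk upload.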

Anyway, I'm closing this down since it's an old issue.