nearform/leaistic

Test against various common ES bad states

temsa commented

ES clusters are sensitive to a number of issues that can occur during massive reindexing:

  • cluster crashing
  • running out of memory and stalling
  • a new shard cannot be allocated
  • a shard cannot be replicated
  • vm.max_map_count set too low
  • split-brain issues caused by a network partition, generally leading to data loss
  • shards that do not answer a query in time (e.g. a search query getting results back from 4 shards instead of 5)

We should find ways to test these states and check how our create/update/delete operations behave in each of them.
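
As a possible starting point for the partial shard case, here is a minimal sketch of a check a test could run on a search response. It only relies on the `timed_out` flag and the `_shards` counters (`total`, `successful`, `failed`) that Elasticsearch returns with search results. The `SearchSummary` type and the `isPartialResult` and `missingShards` helpers are hypothetical names used for illustration, they are not part of leaistic.

```typescript
// Subset of an Elasticsearch search response that matters for this check.
interface SearchSummary {
  timed_out: boolean;
  _shards: {
    total: number;
    successful: number;
    failed: number;
  };
}

// Hypothetical helper: number of shards that did not answer successfully.
function missingShards(response: SearchSummary): number {
  return response._shards.total - response._shards.successful;
}

// Hypothetical helper: true when the response cannot be trusted as complete,
// i.e. the query timed out or at least one shard did not answer successfully.
function isPartialResult(response: SearchSummary): boolean {
  return (
    response.timed_out ||
    response._shards.failed > 0 ||
    missingShards(response) > 0
  );
}

// The example from the list above: 4 shards answered out of 5.
const fourOutOfFive: SearchSummary = {
  timed_out: false,
  _shards: { total: 5, successful: 4, failed: 1 },
};

if (isPartialResult(fourOutOfFive)) {
  console.log(`partial result, missing shards: ${missingShards(fourOutOfFive)}`);
}
```

A test for create/update/delete could then put the cluster in one of the bad states above and assert that the operation either fails cleanly or never proceeds on a response for which `isPartialResult` is true.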