elastic/elasticsearch

version_conflict_engine_exception with bulk update

Closed this issue · 6 comments

Elasticsearch version:

"version" : {
    "number" : "2.1.1",
    "build_hash" : "40e2c53a6b6c2972b3d13846e450e66f4375bd71",
    "build_timestamp" : "2015-12-15T13:05:55Z",
    "build_snapshot" : false,
    "lucene_version" : "5.3.1"
  }

JVM version:

"jvm":{"pid":15324,"version":"1.7.0_07","vm_name":"Java HotSpot(TM) Client VM","vm_version":"23.3-b01","vm_vendor":"Oracle Corporation","start_time_in_millis":1458163388025,"mem":{"heap_init_in_bytes":268435456,"heap_max_in_bytes":1037959168,"non_heap_init_in_bytes":12746752,"non_heap_max_in_bytes":100663296,"direct_max_in_bytes":1037959168}

OS version:

"os":{"refresh_interval_in_millis":1000,"name":"Windows Server 2008 R2","arch":"x86","version":"6.1","available_processors":4,"allocated_processors":4},"process":{"refresh_interval_in_millis":1000,"id":15324,"mlockall":false},

Description of the problem including expected versus actual behavior:
I'm updating a document with two bulk requests. The first request contains three updates and the second contains just one.
The response to the first bulk request reports complete success, but the response to the second one reports a version conflict.
The first request contains three updates to the same document:

16:27:34.325 {ElasticSearch} 
HTTP Path: /_bulk 
HTTP POST Request: {
....
{"update": {"_index": "session-2016.03.14", "_type": "session", "_id": "3"}}
{"doc":{"states":{"state1":{"info": "some state info"}}}}

{"update": {"_index": "session-2016.03.14", "_type": "session", "_id": "3"}}
{"doc":{"states":{"state2":{"info": "some state info"}}}}


{"update": {"_index": "session-2016.03.14", "_type": "session", "_id": "3"}}
{"doc":{"states":{"state3":{"info": "some state info"}}}}
....

Then the second request, which contains just one update:

16:27:34.334 {ElasticSearch} 
HTTP Path: /_bulk 
HTTP POST Request: 
{"update": {"_index": "session-2016.03.14", "_type": "session", "_id": "3"}}
{"doc":{"states":{"state4":{"info": "some state info"}}}}

Then the response to the first request, where all statuses are 200:

16:27:34.391 {ElasticSearch} Response from ElasticSearch localhost:9200: 
("took"=63,"errors"="false","items"=(

"JSON_ARRAY_ELEM"=("update"=("_index"="session-2016.03.14","_type"="session","_id"="3","_version"=6,"_shards"=("total"=2,"successful"=1,"failed"=0),"status"=200)),

"JSON_ARRAY_ELEM"=("update"=("_index"="session-2016.03.14","_type"="session","_id"="3","_version"=7,"_shards"=("total"=2,"successful"=1,"failed"=0),"status"=200)),

"JSON_ARRAY_ELEM"=("update"=("_index"="session-2016.03.14","_type"="session","_id"="3","_version"=8,"_shards"=("total"=2,"successful"=1,"failed"=0),"status"=200))))

And the response to the second request, with status 409:

16:27:34.391 {ElasticSearch} Response from ElasticSearch localhost:9200: 
("took"=25,"errors"="true","items"=(
...
"JSON_ARRAY_ELEM"=("update"=("_index"="session-2016.03.14","_type"="session","_id"="3","status"=409,"error"=("type"="version_conflict_engine_exception","reason"="[session][3]: version conflict, current [6], provided [5]","index"="session-2016.03.14","shard"="1"))),
....

Steps to reproduce:
There are no specific steps to reproduce it, and I've observed it just once.

Additional info:

"gc_collectors":["Copy","MarkSweepCompact"],"memory_pools":["Code Cache","Eden Space","Survivor Space","Tenured Gen","Perm Gen"]},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s","queue_size":-1},"index":{"type":"fixed","min":4,"max":4,"queue_size":200},"fetch_shard_store":{"type":"scaling","min":1,"max":8,"keep_alive":"5m","queue_size":-1},"get":{"type":"fixed","min":4,"max":4,"queue_size":1000},"snapshot":{"type":"scaling","min":1,"max":2,"keep_alive":"5m","queue_size":-1},"force_merge":{"type":"fixed","min":1,"max":1,"queue_size":-1},"suggest":{"type":"fixed","min":4,"max":4,"queue_size":1000},"bulk":{"type":"fixed","min":4,"max":4,"queue_size":50},"warmer":{"type":"scaling","min":1,"max":2,"keep_alive":"5m","queue_size":-1},"flush":{"type":"scaling","min":1,"max":2,"keep_alive":"5m","queue_size":-1},"search":{"type":"fixed","min":7,"max":7,"queue_size":1000},"fetch_shard_started":{"type":"scaling","min":1,"max":8,"keep_alive":"5m","queue_size":-1},"listener":{"type":"fixed","min":2,"max":2,"queue_size":-1},"percolate":{"type":"fixed","min":4,"max":4,"queue_size":1000},"refresh":{"type":"scaling","min":1,"max":2,"keep_alive":"5m","queue_size":-1},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m","queue_size":-1}},..."max_content_length_in_bytes":104857600},"plugins":[]}}}

@atm028 Your second update request happened at the same time as another request, so between fetching the document, updating it, and reindexing it, another request made an update.

See the retry_on_conflict parameter in the docs: https://www.elastic.co/guide/en/elasticsearch/reference/2.2/docs-update.html#_parameters_3
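As a rough illustration of that suggestion, the sketch below builds a bulk body in which each update action asks Elasticsearch to retry on a version conflict. It assumes the `_retry_on_conflict` key in the bulk action metadata as documented for the 2.x bulk API (other releases may spell it differently); the helper name `build_bulk_updates` and the retry count of 3 are made up for the example.

```python
import json


def build_bulk_updates(index, doc_type, doc_id, partial_docs, retries=3):
    """Return an NDJSON bulk body with one update action per partial doc.

    Assumption: the per-action retry setting is spelled `_retry_on_conflict`,
    as in the 2.x bulk API docs. `retries=3` is an arbitrary example value.
    """
    lines = []
    for partial in partial_docs:
        action = {
            "update": {
                "_index": index,
                "_type": doc_type,
                "_id": doc_id,
                "_retry_on_conflict": retries,
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps({"doc": partial}))
    # The bulk API expects every line, including the last, to end with "\n".
    return "\n".join(lines) + "\n"


body = build_bulk_updates(
    "session-2016.03.14",
    "session",
    "3",
    [{"states": {"state4": {"info": "some state info"}}}],
)
```

The resulting `body` would be POSTed to `/_bulk` in place of the second request shown in the report.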

@clintongormley But I used a single client and a single Elasticsearch node, and the client sent both requests over a single connection (HTTP 1.1 with keep-alive). Where does the other process come from? Or does it mean that each request is handled in its own thread, even when both arrive on the same connection?

If you send a request and wait for the response before sending the next request, then they will be executed serially. But I think you've sent more requests than you realise, eg looking at the error message:

version conflict, current [6], provided [5]

...you've made more than one update to that document
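To make the error message concrete, here is a toy in-memory model of the get, modify, reindex cycle described earlier. It is not Elasticsearch's implementation; the `VersionedStore` class and its methods are hypothetical and only mimic the version check. Two updaters both read version 5, the first write succeeds and bumps the version to 6, and the second write is rejected.

```python
class VersionConflict(Exception):
    pass


class VersionedStore:
    """Toy single-document store with an optimistic version check."""

    def __init__(self, source, version):
        self.source = dict(source)
        self.version = version

    def get(self):
        # Return a copy of the document together with the version that was read.
        return dict(self.source), self.version

    def index(self, source, expected_version):
        # Reject the write if the document changed since it was read.
        if expected_version != self.version:
            raise VersionConflict(
                "version conflict, current [%d], provided [%d]"
                % (self.version, expected_version)
            )
        self.source = source
        self.version += 1
        return self.version


store = VersionedStore({"states": {}}, version=5)

# Two overlapping updates both fetch the document at version 5.
doc_a, ver_a = store.get()
doc_b, ver_b = store.get()

# The first update is reindexed and the version becomes 6.
doc_a["states"] = dict(doc_a["states"], state1={"info": "some state info"})
new_version = store.index(doc_a, ver_a)

# The second update still carries version 5 and is rejected.
doc_b["states"] = dict(doc_b["states"], state4={"info": "some state info"})
try:
    store.index(doc_b, ver_b)
    message = None
except VersionConflict as exc:
    message = str(exc)
```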

That's true, the second update request was sent before the first one had completed. But if the requests are sent over a single connection, then the updates to the document should be applied sequentially, and then two responses would be sent to the client. That holds, of course, only if they are handled in a single thread, since it is a single connection. At least in the code, the same thread context is used for dispatching requests, isn't it?

No. Requests are handled asynchronously.
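Given that, one client-side option is to wait for each bulk response before sending the next request, and to resend only the items that came back with status 409. The sketch below assumes a hypothetical blocking `send` callable that takes a list of (action, doc) pairs and returns the parsed bulk response as a dict shaped like the responses logged above; the function names and `max_attempts` are illustrative only.

```python
def conflicted_positions(bulk_response):
    """Return positions of update items that failed with HTTP 409."""
    return [
        i
        for i, item in enumerate(bulk_response.get("items", []))
        if item.get("update", {}).get("status") == 409
    ]


def send_serially(batches, send, max_attempts=3):
    """Send each batch only after the previous response has arrived.

    `send` is a hypothetical blocking callable: it takes a list of
    (action, doc) pairs and returns the parsed bulk response.
    """
    for pairs in batches:
        pending = list(pairs)
        for _ in range(max_attempts):
            response = send(pending)  # blocks until the response is back
            failed = conflicted_positions(response)
            if not failed:
                break
            # Resend only the items that hit a version conflict.
            pending = [pending[i] for i in failed]
        else:
            raise RuntimeError(
                "updates still conflicting after %d attempts" % max_attempts
            )
```

Setting `retry_on_conflict` on the server side covers the same case without a client round trip, so this loop is mainly useful when the order of requests also matters.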

@clintongormley ok, thank you, now the reason is clear