Large meta.error-description field fails Elasticsearch metric store ingest
gbanasiak opened this issue · 0 comments
Rally version (get with esrally --version):
esrally 2.10.0.dev0 (git revision: a2c09a751d7e2797fde531f90aea43ac1375c987)
Description of the problem including expected versus actual behavior:
In certain scenarios, Rally can produce a large meta.error-description field in rally-metrics-* documents. Elasticsearch cannot index such documents, and the race fails. The meta.error-description field is mapped as keyword, which has a term length limit of 32766 bytes imposed by Lucene.
The characteristic symptom is the following error:
Document contains at least one immense term in field="meta.error-description" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.
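One possible mitigation, sketched here only to illustrate the byte limit, is to cap the description before the document is sent to the metrics store. The helper name truncate_utf8 and the constant LUCENE_MAX_TERM_BYTES are hypothetical and not part of Rally; the limit value is taken from the error message above.

```python
# Hypothetical helper, not Rally code: cap a string at the Lucene term limit.
LUCENE_MAX_TERM_BYTES = 32766  # limit quoted in the error message above


def truncate_utf8(text: str, max_bytes: int = LUCENE_MAX_TERM_BYTES) -> str:
    """Return text cut so that its UTF-8 encoding is at most max_bytes long."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # The limit is in bytes, not characters, so the cut may land inside a
    # multi-byte character; errors="ignore" drops that incomplete tail.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


# Simulate a description built by joining many per-item bulk errors.
item_error = "HTTP status: 409, message: [some-id]: version conflict, document already exists (current version [1])"
description = " | ".join([item_error] * 1000)
capped = truncate_utf8(description)
```

Because the cut is made on the encoded bytes, the result stays within the limit even when the text contains multi-byte characters.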
Provide logs (if relevant):
2024-02-27 20:37:56,153 ActorAddr-(T|:37759)/PID:484 esrally.driver.runner WARNING Bulk request failed: [HTTP status: 409, message: [-ChI7I0BiZHehfRA873C]: version conflict, document already exists (current version [1]) | HTTP status: 409, message: [-ChI7I0BiZHehfRA877C]: version conflict, document already exists (current version [1]) | HTTP status: 409, message: [-ChI7I0BiZHehfRA87_C]: version conflict, document already exists (current version [1]) | HTTP status: 409, message: [-ChI7I0BiZHehfRA87nB]: version conflict, document already exists (current version [1]) | [..]
2024-02-27 20:38:24,838 ActorAddr-(T|:33739)/PID:32157 esrally.metrics ERROR Unretryable error encountered when sending metrics to remote metrics store: [document_parsing_exception] - Full error(s) [[{'index': {'_index': 'rally-metrics-2024-02', '_id': '1BZK7I0BHqD26mvHOsiZ', 'status': 400, 'error': {'type': 'document_parsing_exception', 'reason': "[1:1166] failed to parse field [meta.error-description] of type [keyword] in document with id '1BZK7I0BHqD26mvHOsiZ'. Preview of field's value: 'HTTP status: 409, message: [-ChI7I0BiZHehfRA873C]: version conflict, document already exists (current version [1]) | HTTP status: 409, message: [-ChI7I0BiZHehfRA877C]: version conflict, document already exists (current version [1]) | HTTP status: 409, message: [-ChI7I0BiZHehfRA87_C]: version conflict, document already exists (current version [1]) | HTTP status: 409, message: [-ChI7I0BiZHehfRA87nB]: version conflict, document already exists (current version [1]) | ... ", 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'Document contains at least one immense term in field="meta.error-description" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: \'[72, 84, 84, 80, 32, 115, 116, 97, 116, 117, 115, 58, 32, 52, 48, 57, 44, 32, 109, 101, 115, 115, 97, 103, 101, 58, 32, 91, 45, 67]...\''}}, [..]
2024-02-27 20:38:25,742 -not-actor-/PID:32059 esrally.racecontrol ERROR A benchmark failure has occurred
2024-02-27 20:38:25,742 -not-actor-/PID:32059 esrally.racecontrol INFO Telling benchmark actor to exit.
2024-02-27 20:38:25,743 -not-actor-/PID:32059 esrally.rally ERROR Cannot run subcommand [race].
Traceback (most recent call last):
  File "/home/esbench/rally/esrally/rally.py", line 1184, in dispatch_sub_command
    race(cfg, args.kill_running_processes)
  File "/home/esbench/rally/esrally/rally.py", line 932, in race
    with_actor_system(racecontrol.run, cfg)
  File "/home/esbench/rally/esrally/rally.py", line 962, in with_actor_system
    runnable(cfg)
  File "/home/esbench/rally/esrally/racecontrol.py", line 408, in run
    raise e
  File "/home/esbench/rally/esrally/racecontrol.py", line 405, in run
    pipeline(cfg)
  File "/home/esbench/rally/esrally/racecontrol.py", line 74, in __call__
    self.target(cfg)
  File "/home/esbench/rally/esrally/racecontrol.py", line 344, in benchmark_only
    return race(cfg, external=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/esbench/rally/esrally/racecontrol.py", line 302, in race
    raise exceptions.RallyError(result.message, result.cause)
esrally.exceptions.RallyError: Traceback (most recent call last):
  File "/home/esbench/rally/esrally/metrics.py", line 106, in guarded
    return target(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/esbench/.local/lib/python3.11/site-packages/elasticsearch/helpers/actions.py", line 524, in bulk
    for ok, item in streaming_bulk(
  File "/home/esbench/.local/lib/python3.11/site-packages/elasticsearch/helpers/actions.py", line 438, in streaming_bulk
    for data, (ok, info) in zip(
  File "/home/esbench/.local/lib/python3.11/site-packages/elasticsearch/helpers/actions.py", line 355, in _process_bulk_chunk
    yield from gen
  File "/home/esbench/.local/lib/python3.11/site-packages/elasticsearch/helpers/actions.py", line 274, in _process_bulk_chunk_success
    raise BulkIndexError(f"{len(errors)} document(s) failed to index.", errors)
elasticsearch.helpers.BulkIndexError: 18 document(s) failed to index.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/esbench/rally/esrally/actor.py", line 92, in guard
    return f(self, msg, sender)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/esbench/rally/esrally/driver/driver.py", line 306, in receiveMsg_WakeupMessage
    self.driver.post_process_samples()
  File "/home/esbench/rally/esrally/driver/driver.py", line 1007, in post_process_samples
    self.sample_post_processor(raw_samples)
  File "/home/esbench/rally/esrally/driver/driver.py", line 1120, in __call__
    self.metrics_store.flush(refresh=False)
  File "/home/esbench/rally/esrally/metrics.py", line 930, in flush
    self._client.bulk_index(index=self._index, items=self._docs)
  File "/home/esbench/rally/esrally/metrics.py", line 81, in bulk_index
    self.guarded(elasticsearch.helpers.bulk, self._client, items, index=index, chunk_size=5000)
  File "/home/esbench/rally/esrally/metrics.py", line 170, in guarded
    raise exceptions.RallyError(msg)
esrally.exceptions.RallyError: Unretryable error encountered when sending metrics to remote metrics store: [document_parsing_exception]