logging batch ingest errors
Opened this issue · 0 comments
matthewhanson commented
When one or more Items fails ingestion into Elasticsearch during a bulk write, the returned error message is long and unhelpful as it is a big string with info from all the 'docs' that were in the batch, here's a sample:
2019-01-22T20:51:11.320Z - error: _index=items, _type=doc, _id=LC80230012015141LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3705, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015125LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3690, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015109LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3789, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015093LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3706, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015077LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3691, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102016096LGN01, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3600, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102015253LGN02, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3601, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102014106LGN01, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3604, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102014234LGN01, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3602, _primary_term=1, status=201,
Therefore one record starts with "_index" and ends with "status". If they were successful they will have "failed=0"
However, a failed record actually looks different:
_index=items, _type=doc, _id=LC80231212013343LGN00, status=400, type=mapper_parsing_exception, reason=failed to parse [geometry], type=parse_exception, reason=invalid number of points in LinearRing (found [1] - must be >= [4]),
It doesn't contain "successful", "failed" or other terms and instead has status=400 and a "reason" which is the actual error.
We don't want to log the entire string, instead we want to log each item that failed in the batch, if any.
We just need to log the "_id" field and the "reason" field for the error.