sat-utils/sat-api

logging batch ingest errors

When one or more Items fail ingestion into Elasticsearch during a bulk write, the returned error message is long and unhelpful: it is one big string with info from all the 'docs' that were in the batch. Here's a sample:

2019-01-22T20:51:11.320Z - error: _index=items, _type=doc, _id=LC80230012015141LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3705, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015125LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3690, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015109LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3789, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015093LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3706, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015077LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3691, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102016096LGN01, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3600, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102015253LGN02, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3601, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102014106LGN01, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3604, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102014234LGN01, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3602, _primary_term=1, status=201, 

In this string, each record starts with "_index" and ends with "status". Successful records contain "failed=0".

However, a failed record actually looks different:

 _index=items, _type=doc, _id=LC80231212013343LGN00, status=400, type=mapper_parsing_exception, reason=failed to parse [geometry], type=parse_exception, reason=invalid number of points in LinearRing (found [1] - must be >= [4]),

It doesn't contain "successful", "failed", or the other success fields. Instead it has status=400 and a "reason", which is the actual error.
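
To illustrate the difference between the two record shapes, here is a rough Python sketch that splits the flattened string on "_index=" and picks out the records carrying a "reason". The helper names (`split_records`, `is_failed`, `record_id`) are made up for this example, and the sample is shortened from the logs above. Parsing the flattened string like this is fragile (a reason could itself contain ", "), so treat it as a way to describe the format rather than a proposed fix.

```python
import re


def split_records(flat):
    """Split the flattened log string into one string per record."""
    # Every record begins with "_index=", so split just before each occurrence.
    parts = re.split(r"(?=_index=)", flat)
    return [p.strip().rstrip(",") for p in parts if p.strip()]


def is_failed(record):
    """Failed records carry a "reason=" field; successful ones do not."""
    return "reason=" in record


def record_id(record):
    """Return the value of the "_id" field, or None if it is missing."""
    match = re.search(r"_id=([^,]+)", record)
    return match.group(1) if match else None


# Shortened sample: one successful record followed by one failed record.
sample = (
    "_index=items, _type=doc, _id=LC80230012015141LGN00, _version=1, "
    "result=created, total=2, successful=2, failed=0, _seq_no=3705, "
    "_primary_term=1, status=201, "
    "_index=items, _type=doc, _id=LC80231212013343LGN00, status=400, "
    "type=mapper_parsing_exception, reason=failed to parse [geometry], "
    "type=parse_exception, reason=invalid number of points in LinearRing "
    "(found [1] - must be >= [4]), "
)

failed_ids = [record_id(r) for r in split_records(sample) if is_failed(r)]
print(failed_ids)
```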

We don't want to log the entire string. Instead, we want to log each item that failed in the batch, if any.

For each failed item, we just need to log the "_id" field and the "reason" field of the error.
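
A minimal sketch of the desired behaviour, in Python for illustration only. It assumes the ingest code still has the parsed Elasticsearch bulk response (an object with an `errors` flag and an `items` list, where a failed item carries an `error` with `type`, `reason`, and optionally `caused_by`) rather than only the flattened string. The function names, the logger name, and the sample response are hypothetical.

```python
import logging

logger = logging.getLogger("ingest")  # hypothetical logger name


def failed_items(bulk_response):
    """Return (_id, reason) pairs for every item that failed in a bulk write."""
    failures = []
    # The bulk response sets "errors" to true when at least one item failed.
    if not bulk_response.get("errors"):
        return failures
    for item in bulk_response.get("items", []):
        # Each item is keyed by its action name, e.g. "index" or "create".
        for result in item.values():
            error = result.get("error")
            if error is None:
                continue
            reason = error.get("reason", "")
            # Append the nested cause when present; it usually holds the detail.
            cause = error.get("caused_by", {}).get("reason")
            if cause:
                reason = f"{reason}: {cause}"
            failures.append((result.get("_id"), reason))
    return failures


def log_failures(bulk_response):
    """Log one short line per failed item instead of the whole response."""
    for doc_id, reason in failed_items(bulk_response):
        logger.error("failed to ingest %s: %s", doc_id, reason)


# Hypothetical response built from the two record shapes shown above.
sample_response = {
    "errors": True,
    "items": [
        {"index": {"_index": "items", "_id": "LC80230012015141LGN00",
                   "result": "created", "status": 201}},
        {"index": {"_index": "items", "_id": "LC80231212013343LGN00",
                   "status": 400,
                   "error": {
                       "type": "mapper_parsing_exception",
                       "reason": "failed to parse [geometry]",
                       "caused_by": {
                           "type": "parse_exception",
                           "reason": "invalid number of points in LinearRing "
                                     "(found [1] - must be >= [4])",
                       },
                   }}},
    ],
}

log_failures(sample_response)
```

With this shape, a batch where everything succeeded logs nothing, and a batch with failures logs only the "_id" and "reason" of each failed item.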