OCR-D/ocrmultieval

OcrdSegmentEvaluate: use different metric

Opened this issue · 1 comment

https://github.com/kba/ocrmultieval/blob/5de79f3021b48f83f9cb798a484fd472d21ed94b/ocrmultieval/backends/OcrdSegmentEvaluate.py#L27-L28

This retrieves only the mAP score, which is the least useful/adequate of the available metrics and was only added for comparison with similar benchmarks. The better keys would be `precision` | `recall` | `pixel_precision` | `pixel_recall` | `pixel_iou` | `oversegmentation` | `undersegmentation`, found under either `by-category` → `category` or `by-image` → `pageid` → `category`.
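
For illustration, a minimal sketch of pulling those richer metrics out of the report instead of only mAP; the nested layout and the `report` variable follow the key names above and are assumptions, not the verified ocrd_segment output structure:

```python
# Metric keys named in the issue text.
METRIC_KEYS = (
    "precision", "recall",
    "pixel_precision", "pixel_recall", "pixel_iou",
    "oversegmentation", "undersegmentation",
)

def extract_metrics(report: dict) -> dict:
    """Collect all listed metrics, document-wide and per page.

    Assumes `report` nests scores as described in the issue:
    by-category -> category -> metric, and
    by-image -> pageid -> category -> metric.
    """
    results = {"by-category": {}, "by-image": {}}
    # document-wide scores per region category
    for category, scores in report.get("by-category", {}).items():
        results["by-category"][category] = {
            key: scores.get(key) for key in METRIC_KEYS}
    # per-page scores per region category
    for pageid, categories in report.get("by-image", {}).items():
        results["by-image"][pageid] = {
            category: {key: scores.get(key) for key in METRIC_KEYS}
            for category, scores in categories.items()}
    return results
```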

kba commented

In fact, I think it would be best to retain all the metrics of all the tools, since it is hard to guess beforehand what users will need for their particular use case. So ideally, we should discuss and decide on a common exchange format, like the PAGE-Eval-Schema (XML), a common JSON-Schema (JSON), and common column naming (CSV). This should also make the same metric from different tools comparable (e.g. `dinglehopper.wer` vs `isriocreval.wer`) and allow metrics-based selection instead of backend-based selection ("provide me with the precision for the layout detection of TextRegion by whatever tool offers this").
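
As a hypothetical sketch of what such metrics-based selection over a common JSON format could look like; the record layout, field names, and tool identifiers here are illustrative assumptions, not a decided schema:

```python
# Hypothetical common report: every metric from every backend is kept
# as a flat, tool-qualified record (assumed layout, not a real schema).
report = {
    "metrics": [
        {"tool": "dinglehopper", "metric": "wer", "value": 0.12},
        {"tool": "isri_ocreval", "metric": "wer", "value": 0.11},
        {"tool": "ocrd_segment", "metric": "precision",
         "category": "TextRegion", "value": 0.87},
    ]
}

def select(report: dict, metric: str, **filters) -> list:
    """Return all records for a metric, regardless of which backend produced it."""
    return [record for record in report["metrics"]
            if record["metric"] == metric
            and all(record.get(key) == value
                    for key, value in filters.items())]

# "the precision for the layout detection of TextRegion
#  by whatever tool offers this":
print(select(report, "precision", category="TextRegion"))
# same metric from different tools, side by side:
print(select(report, "wer"))
```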