lquirosd/P2PaLA

[Enhancement] Page-XML extractor

EvertonTomalok opened this issue · 2 comments

Adapt the script to extract page information/coordinates from the formats best known in the industry, such as YOLO and PASCAL/VOC.

Observation: I could help with this task.

mrocr commented

@EvertonTomalok
Can you clarify? What do you mean by "extract information/coordinates about page"?

  • Are you talking about improving the model backbone?
  • Are you talking about improving the baseline-to-textline extractor?

Hi,
The YOLO and PASCAL/VOC formats are focused on image segmentation data, while PAGE-XML is focused on document image representation. That covers not just the segmentation of the document but also the relationships between objects (e.g. reading order), the data in the document (transcription, probabilistic index, modernization, notes, ...), and of course the metadata of the document itself.

If you think a YOLO/PASCAL format converter would be useful for the community, please feel free to contribute (send a pull request).
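
For anyone picking this up, here is a minimal sketch of what such a converter could look like. It is not part of P2PaLA; the function name `page_xml_to_yolo` and the `class_ids` mapping are hypothetical. It assumes a PAGE-XML file whose `Page` element carries `imageWidth`/`imageHeight` and whose regions store their polygon in a `Coords points="x,y x,y ..."` attribute (older PAGE versions that use `Point` child elements are not handled). As noted above, the conversion is lossy: each polygon is reduced to its bounding box, and reading order, transcriptions, and metadata are dropped.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip the XML namespace so the sketch works across PAGE schema versions."""
    return tag.rsplit("}", 1)[-1]


def page_xml_to_yolo(xml_text, class_ids):
    """Return YOLO label lines: 'class x_center y_center width height' (normalized).

    class_ids is a hypothetical mapping from PAGE element names
    (e.g. "TextRegion") to integer YOLO class ids.
    """
    root = ET.fromstring(xml_text)
    page = next(el for el in root.iter() if _local(el.tag) == "Page")
    img_w = float(page.get("imageWidth"))
    img_h = float(page.get("imageHeight"))

    lines = []
    for element in page.iter():
        name = _local(element.tag)
        if name not in class_ids:
            continue
        # Look only at the element's direct Coords child.
        coords = next((c for c in element if _local(c.tag) == "Coords"), None)
        if coords is None or not coords.get("points"):
            continue  # assumption: skip regions without a 'points' attribute
        points = [tuple(map(float, p.split(","))) for p in coords.get("points").split()]
        xs, ys = zip(*points)
        x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
        # Polygon -> axis-aligned bounding box, normalized by image size.
        x_center = (x_min + x_max) / 2 / img_w
        y_center = (y_min + y_max) / 2 / img_h
        width = (x_max - x_min) / img_w
        height = (y_max - y_min) / img_h
        lines.append(
            f"{class_ids[name]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines
```

Usage would be along the lines of `page_xml_to_yolo(open("page.xml", encoding="utf-8").read(), {"TextRegion": 0, "TextLine": 1})`, writing the returned lines to a `.txt` file next to the image.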