dtabf-to-page

Create PAGE-XML Ground Truth from DTABf TEI

Goal

  • Input 1: TEI encoded according to the DTABf guidelines
  • Input 2: Page-wise PAGE-XML with coordinates
  • Output: Page-wise PAGE-XML enriched with information from the TEI, including:
    • text
    • font info
    • region classes
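
The goal above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it assumes the TEI passed in covers exactly one page, that each TEI <head> or <p> corresponds to one PAGE TextRegion in the same document order, and that Antiqua passages are marked with rendition="#aq". The function names tei_blocks and enrich_page, the REGION_TYPES mapping, and the choice of the 2019-07-15 PAGE namespace are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Assumed PAGE schema version; adjust to the version of the input files.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"

# Hypothetical mapping from TEI element names to PAGE region types.
REGION_TYPES = {"head": "heading", "p": "paragraph"}


def tei_blocks(tei_xml):
    """Collect text, region type and a font hint for each <head>/<p>, in document order."""
    root = ET.fromstring(tei_xml)
    blocks = []
    for el in root.iter():
        name = el.tag.split("}")[-1]  # strip the namespace part of the tag
        if name not in REGION_TYPES:
            continue
        # Join all descendant text and normalize whitespace.
        text = " ".join("".join(el.itertext()).split())
        # Assumption: any <hi rendition="#aq"> inside the block signals Antiqua.
        antiqua = any(
            hi.get("rendition") == "#aq" for hi in el.iter(f"{{{TEI_NS}}}hi")
        )
        blocks.append(
            {
                "text": text,
                "type": REGION_TYPES[name],
                "font": "Antiqua" if antiqua else None,
            }
        )
    return blocks


def enrich_page(page_xml, blocks):
    """Pair TextRegions with TEI blocks by position and add text, type and font."""
    ET.register_namespace("", PAGE_NS)
    root = ET.fromstring(page_xml)
    regions = root.findall(f".//{{{PAGE_NS}}}TextRegion")
    for region, block in zip(regions, blocks):
        region.set("type", block["type"])  # region class
        equiv = ET.SubElement(region, f"{{{PAGE_NS}}}TextEquiv")
        ET.SubElement(equiv, f"{{{PAGE_NS}}}Unicode").text = block["text"]
        if block["font"]:
            # Font info, stored on the region as a TextStyle element.
            ET.SubElement(
                region, f"{{{PAGE_NS}}}TextStyle", {"fontFamily": block["font"]}
            )
    return ET.tostring(root, encoding="unicode")
```

Pairing by position is the weakest part of this sketch: real pages would likely need an alignment step that tolerates missing, merged, or reordered regions.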

Prior Work

https://github.com/jbaiter/archiscribe