PRImA-Research-Lab/prima-page-converter
Command-line tool to convert page layout files to the latest PAGE XML format. It supports all previous versions of the PAGE format as well as ALTO XML, FineReader XML, and hOCR.
HTML · Apache-2.0
Issues
No text lines in ALTO output
#13 opened by stweil - 0
Broken text converting ABBYY files
#23 opened by mikegerber - 2
does not convert to latest PAGE schema by default
#21 opened by bertsky - 8
Convert images with bounding box to PAGE
#20 opened by IKetchup - 5
Error writing target ALTO XML file
#16 opened by kba - 9
Page Converter producing messy Unicode blocks
#14 opened by novacellus - 7
PDF to Page-xml
#5 opened by mrocr - 2
HOCR to PAGE converting howto?
#1 opened by jmokoistinen - 1
NullPointerException
#12 opened by stweil - 2
negative input points
#10 opened - 4
SAXParseException: Premature end of file
#8 opened by chaddy314 - 2
Error writing target PAGE XML file
#9 opened - 0
labelImg json to page-xml
#7 opened by mrocr - 4
Djvu to Page-xml
#4 opened by mrocr - 4
"Could not save target PAGE XML file:"
#3 opened by vndee