Warning: not fully implemented - work in progress
Python 3 script for performing various operations on ALTO (Analyzed Layout and Text Object) XML files.
Planned features:
- extract OCR confidence of the ALTO document(s)
- extract text content of the ALTO document(s)
- extract graphical elements of the ALTO document(s)
- extract metadata of the ALTO document(s)
- transform ALTO document(s) to target format(s) with XSLT
- query content of the ALTO document(s) with XPath
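
Since these features are still planned, the following is only a minimal sketch of how text and OCR confidence extraction might look, not the script's actual implementation. It assumes the standard ALTO attributes CONTENT and WC (word confidence) on String elements. The function names (local_name, extract_text, mean_word_confidence) and the sample document are hypothetical. Elements are matched by local name so the sketch does not depend on a particular ALTO namespace version.

```python
try:
    from lxml import etree  # preferred, see Requirements
except ImportError:
    # Assumption for this sketch only: fall back to the standard library.
    import xml.etree.ElementTree as etree

# Hypothetical sample document, using the ALTO v3 namespace.
SAMPLE_ALTO = b"""<?xml version="1.0" encoding="UTF-8"?>
<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout>
    <Page ID="p1">
      <PrintSpace>
        <TextBlock ID="b1">
          <TextLine>
            <String CONTENT="Hello" WC="0.9"/>
            <SP/>
            <String CONTENT="world" WC="0.7"/>
          </TextLine>
          <TextLine>
            <String CONTENT="again" WC="0.8"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def local_name(element):
    """Return the tag name without its namespace ('' for comments etc.)."""
    tag = element.tag
    if not isinstance(tag, str):
        return ""
    return tag.rsplit("}", 1)[-1]


def extract_text(root):
    """Join String/@CONTENT values, one output line per TextLine."""
    lines = []
    for element in root.iter():
        if local_name(element) == "TextLine":
            words = [
                child.get("CONTENT", "")
                for child in element.iter()
                if local_name(child) == "String"
            ]
            lines.append(" ".join(words))
    return "\n".join(lines)


def mean_word_confidence(root):
    """Average the WC attribute over all String elements that carry one."""
    scores = [
        float(element.get("WC"))
        for element in root.iter()
        if local_name(element) == "String" and element.get("WC") is not None
    ]
    return sum(scores) / len(scores) if scores else None


if __name__ == "__main__":
    root = etree.fromstring(SAMPLE_ALTO)
    print(extract_text(root))
    print(mean_word_confidence(root))
```

A plain mean over WC is only one possible definition of document confidence; a real implementation might weight by word length or use the page-level PC attribute where it is present.
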
Requirements:
- lxml for XPath and XSLT support
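
To illustrate why lxml is listed, here is a minimal sketch of an XPath query against an ALTO document. It is an assumption about how the planned query feature could work, not the script's real interface. The function name low_confidence_words, the "alto" prefix, and the sample data are hypothetical. The namespace URI is the ALTO v3 one and would need adjusting for other ALTO versions.

```python
from lxml import etree

# Prefix is arbitrary; the URI must match the ALTO version of the input file.
ALTO_NS = {"alto": "http://www.loc.gov/standards/alto/ns-v3#"}

# Hypothetical sample document.
SAMPLE = b"""<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="Hello" WC="0.9"/>
    <String CONTENT="world" WC="0.7"/>
    <String CONTENT="again" WC="0.8"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def low_confidence_words(root, threshold):
    """Return the CONTENT of every String whose WC is below the threshold."""
    # lxml passes keyword arguments to the expression as XPath variables.
    return root.xpath(
        "//alto:String[number(@WC) < $threshold]/@CONTENT",
        namespaces=ALTO_NS,
        threshold=threshold,
    )


if __name__ == "__main__":
    root = etree.fromstring(SAMPLE)
    print(low_confidence_words(root, 0.8))
```

String elements without a WC attribute are skipped, because number() of a missing attribute is NaN and the comparison is then false.
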