page-to-text

Extracts the text from a PAGE XML file and writes it to stdout.

Usage: python page_to_text.py <page-xml-file>
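
For readers unfamiliar with the format, here is a minimal sketch of how such an extraction might work. It is not the actual page_to_text.py implementation. It assumes the text of interest is the line-level transcription, stored as TextLine/TextEquiv/Unicode elements as in the PRImA PAGE schema. The real script may read region-level or word-level text instead. The names local_name, extract_lines and SAMPLE are hypothetical, and the sample document is hand-written for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written sample, made up for illustration only.
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.png" imageWidth="100" imageHeight="100">
    <TextRegion id="r1">
      <TextLine id="l1">
        <Word id="w1"><TextEquiv><Unicode>Hello</Unicode></TextEquiv></Word>
        <TextEquiv><Unicode>Hello world</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="l2">
        <TextEquiv><Unicode>Second line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names.

    Matching on local names avoids hard-coding one PAGE schema version.
    """
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def extract_lines(root):
    """Return the transcription of every TextLine, in document order."""
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Look only at the line's direct TextEquiv child, so that the
        # TextEquiv of nested Word elements is not emitted a second time.
        for child in elem:
            if local_name(child.tag) != "TextEquiv":
                continue
            for sub in child:
                if local_name(sub.tag) == "Unicode":
                    lines.append(sub.text or "")
                    break
            break  # assumption: use only the first TextEquiv of a line
    return lines


for line in extract_lines(ET.fromstring(SAMPLE)):
    print(line)
# prints:
# Hello world
# Second line
```

To read from a file instead, the same function could be called on ET.parse(path).getroot(), with the path taken from the command line.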