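hOCR highlight parser

Basic parsing of an hOCR HTML document, written for a specific set of needs.

To use it, run:

    ruby parse_highlights.rb input.html > output.txt

The parsed result is written to standard output, which the command above redirects to output.txt. A sample input.html is included.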
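For readers new to hOCR, here is a rough sketch of how word text and bounding boxes can be read from such a file in plain Ruby. It is not the code in parse_highlights.rb, and this README does not say what counts as a "highlight", so the sketch only covers the general hOCR convention of `ocrx_word` spans carrying a `bbox x0 y0 x1 y1` entry in their `title` attribute. The method name `hocr_words`, the constants, and the inline sample markup are all made up for illustration, and the regex approach assumes word spans do not contain nested spans.

```ruby
# Sketch only: NOT the logic of parse_highlights.rb.
# Assumes hOCR word spans look like:
#   <span class='ocrx_word' title='bbox x0 y0 x1 y1; x_wconf N'>text</span>

# Matches a span whose class attribute is exactly "ocrx_word".
# Capture 1: the attribute string, capture 2: the quote char, capture 3: inner HTML.
WORD_SPAN = /<span\b([^>]*\bclass=(["'])ocrx_word\2[^>]*)>(.*?)<\/span>/m

# Matches the four integer coordinates of a bbox entry in a title attribute.
BBOX = /bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/

# Hypothetical helper: returns an array of { text:, bbox: [x0, y0, x1, y1] }.
# Word spans without a bbox are skipped.
def hocr_words(html)
  html.scan(WORD_SPAN).map do |attrs, _quote, inner|
    m = BBOX.match(attrs)
    next unless m

    {
      text: inner.gsub(/<[^>]+>/, "").strip, # drop inline tags such as <strong>
      bbox: m.captures.map(&:to_i)
    }
  end.compact
end

# Made-up sample markup, not taken from the bundled input.html.
sample = <<~HTML
  <span class='ocr_line' title='bbox 10 10 200 40'>
    <span class='ocrx_word' id='word_1_1' title='bbox 10 10 90 40; x_wconf 96'>Hello</span>
    <span class='ocrx_word' id='word_1_2' title='bbox 100 10 200 40; x_wconf 91'><strong>world</strong></span>
  </span>
HTML

hocr_words(sample).each do |word|
  puts "#{word[:text]}\t#{word[:bbox].join(',')}"
end
```

A regex keeps the sketch free of dependencies. For real documents, a proper HTML parser is the safer choice, since attribute order, quoting, and nesting can vary between OCR engines.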