dannnylo/rtesseract

Support hOCR output and parsing

Closed this issue · 5 comments

I'd like to add support for reporting the bounding box of each word, similar to this API: https://github.com/meh/ruby-tesseract-ocr

I'm adding this issue for record keeping and plan to work on it. I haven't missed anything, have I? This sort of functionality doesn't exist in this gem yet, right?

Hello,
I started developing the bounding box feature, but for now it reports boxes per character rather than per word. I still have to find which configuration produces word-level output.
Branch: https://github.com/dannnylo/rtesseract/tree/bounding_box
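
For reference, Tesseract's hOCR output marks each word as a span with class `ocrx_word` and puts its coordinates in the `title` attribute as `bbox x1 y1 x2 y2`. Below is a minimal sketch of pulling word-level boxes out of such output. It is not the code on the branch: `hocr_word_boxes`, the hash keys, and the sample markup are hypothetical and only for illustration, and the regex assumes `class` appears before `title` inside the tag.

```ruby
# Hypothetical helper, not part of rtesseract: extracts word boxes from hOCR text.
# Assumes each word is a <span class='ocrx_word' ... title='bbox x1 y1 x2 y2; ...'>
# with the class attribute appearing before the title attribute.
WORD_SPAN = %r{<span[^>]*class=['"]ocrx_word['"][^>]*title=['"]bbox (\d+) (\d+) (\d+) (\d+)[^'"]*['"][^>]*>(.*?)</span>}m

def hocr_word_boxes(hocr)
  hocr.scan(WORD_SPAN).map do |x1, y1, x2, y2, inner|
    {
      # Drop nested markup such as <strong> or <em> around the word text.
      word: inner.gsub(/<[^>]+>/, '').strip,
      x_start: x1.to_i,
      y_start: y1.to_i,
      x_end: x2.to_i,
      y_end: y2.to_i
    }
  end
end

# Hand-written sample in the shape of hOCR output (illustrative values only).
SAMPLE_HOCR = <<~HOCR
  <span class='ocr_line' id='line_1_1' title="bbox 36 92 580 122">
    <span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
    <span class='ocrx_word' id='word_1_2' title='bbox 109 92 185 116; x_wconf 96'><strong>world</strong></span>
  </span>
HOCR

hocr_word_boxes(SAMPLE_HOCR).each { |box| puts box.inspect }
```

A regex keeps the sketch free of dependencies; for real hOCR files an HTML parser would be more robust against attribute reordering.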

brbrr commented

Are there any updates on this feature? I'm looking for a way to find word coordinates.

Hello,
I tried to create this feature, and it's now working on the bounding_box branch.
If you have any questions, feel free to ask.
I would be happy to hear your feedback.

brbrr commented

Hi, are there any API docs for that branch?
Will it go into master?

This feature is included in the new version of this gem.