Detection rate of 3.4.x inferior compared to previous version 3.3.1

Question

I'm using gImageReader (precompiled Windows version) to recognize Icelandic and German text in PDF documents. After upgrading from 3.3.1 to 3.4.0/2, detection of Icelandic special characters is far worse than in the previous version. I assume this is most likely due to the underlying tesseract-ocr engine.
Would it be possible to swap out just the tesseract-ocr engine without losing the improved user interface of the 3.4.x versions?

Answer 1 · 2024-04-08T08:02:12.000Z

Yes, but you will most likely need to recompile the application against the older libtesseract.