manisandro/gImageReader

[Bug] The program crashes when the OCR language is not English

tyrell-wallace opened this issue · 6 comments

For the sake of organization, even though I've discovered this and #656 at the same time, I'm creating one issue ticket for each.

Basically, the program crashes whenever I try to OCR with a language that isn't English. I've tested it with Japanese, Chinese, German and Spanish, but I assume that might be the case for all languages.

This is the error that pops up:

"
Tesseract has aborted
Tesseract (the OCR engine) has aborted while recognizing text. This may occur if the used traineddatas are corrupt or incompatible with the version of tesseract in use, or due to a bug in tesseract.

The stacktrace below may provide additional information about where the crash occurred.

Your work has been saved under C:\Users\User\gImageReader_crash-save.txt.
"

The generated .txt contains no data.

I'm using the x64 Portable on Windows 10 22H2.

It has been working fine with Hindi and Sanskrit so far. I have no issues with it.

@mxav1111 Hi, can I connect with you? I have some questions regarding this.

Not sure how I can help, but feel free.

Ah, sorry, I wasn't able to configure the program back then. My question was: is there a way to export cropped images from the app? For example, if I run any PSM mode, I want the respective results (lines/words) to be cropped and saved as images.

Cannot reproduce with 3.4.2; please retest and reopen if still relevant. Also, in the event of a tesseract crash, there is not much I can do on the gImageReader side; the issue would need to be reported to tesseract upstream.