[Bug] The program crashes when the OCR language is not English

Question

tyrell-wallace opened this issue a year ago · 6 comments

For the sake of organization, even though I've discovered this and #656 at the same time, I'm creating one issue ticket for each.

Basically, the program crashes whenever I try to OCR with a language that isn't English. I've tested it with Japanese, Chinese, German and Spanish, but I assume that might be the case for all languages.

This is the error that pops up:

"
Tesseract has aborted
Tesseract (the OCR engine) has aborted while recognizing text. This may occur if the used traineddatas are corrupt or incompatible with the version of tesseract in use, or due to a bug in tesseract.

The stacktrace below may provide additional information about where the crash occurred.

Your work has been saved under C:\Users\User\gImageReader_crash-save.txt.
"

The generated .txt contains no data.

I'm using the x64 Portable on Windows 10 22H2.
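
The dialog names corrupt or incompatible traineddata files as one possible cause, so a quick sanity check on the language files may help narrow things down. The following is only a minimal sketch: the `check_traineddata` helper, the tessdata directory passed to it, and the `min_bytes` threshold are all hypothetical, and a file that passes this check can still be incompatible with the installed Tesseract version.

```python
from pathlib import Path


def check_traineddata(tessdata_dir, languages, min_bytes=1024):
    """Report whether each <lang>.traineddata file exists and is not suspiciously small.

    tessdata_dir and min_bytes are assumptions for illustration; this does not
    validate the file contents or their compatibility with Tesseract.
    """
    results = {}
    for lang in languages:
        path = Path(tessdata_dir) / f"{lang}.traineddata"
        if not path.is_file():
            results[lang] = "missing"
        elif path.stat().st_size < min_bytes:
            # An empty or truncated download is a common form of corruption.
            results[lang] = "too small"
        else:
            results[lang] = "ok"
    return results


# Example call (the directory is a placeholder, adjust to the real tessdata location):
# print(check_traineddata(r"path\to\tessdata", ["jpn", "chi_sim", "deu", "spa"]))
```

The language codes in the example call (`jpn`, `chi_sim`, `deu`, `spa`) are the usual Tesseract names for the languages tested above.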

Answer 1 · 2023-11-28T23:11:41.000Z

It is working fine with the Hindi and Sanskrit languages so far. I have no issues with it.

Answer 2 · 2024-01-12T19:48:45.000Z

@mxav1111 hi, can I connect with you? I have some questions regarding this.

Answer 3 · 2024-01-14T17:50:55.000Z

Not sure how I can help, but feel free.

Answer 4 · 2024-01-14T17:56:18.000Z

Ah sorry, I was not able to configure the program back then. I had a question: can we find a way to export cropped images from the app? For example, if I run any PSM mode, I want the respective results (line/word) to be cropped and saved as images.
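
Outside the app, one possible approach is to take the bounding boxes that Tesseract reports and crop the page image with them. This is a rough sketch, not a gImageReader feature: `crop_boxes` is a hypothetical helper that expects a dictionary shaped like the output of `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)`, where level 4 marks lines and level 5 marks words. The commented usage assumes pytesseract and Pillow are installed, and the file names are placeholders.

```python
def crop_boxes(data, level=5, pad=0):
    """Collect (left, top, right, bottom) boxes for entries at one layout level.

    data: dict with parallel lists "level", "left", "top", "width", "height".
    level: 4 for lines, 5 for words (Tesseract's TSV convention).
    pad: optional margin in pixels added around each box.
    """
    boxes = []
    for i, lvl in enumerate(data["level"]):
        if lvl != level:
            continue
        left, top = data["left"][i], data["top"][i]
        width, height = data["width"][i], data["height"][i]
        if width <= 0 or height <= 0:
            continue  # skip degenerate boxes
        boxes.append((max(left - pad, 0), max(top - pad, 0),
                      left + width + pad, top + height + pad))
    return boxes


# Possible usage (assumes pytesseract and Pillow; names are placeholders):
# import pytesseract
# from PIL import Image
# img = Image.open("page.png")
# data = pytesseract.image_to_data(img, config="--psm 6",
#                                  output_type=pytesseract.Output.DICT)
# for n, box in enumerate(crop_boxes(data, level=4)):
#     img.crop(box).save(f"line_{n:03d}.png")
```

Keeping the box computation separate from the OCR call means it can be checked without Tesseract installed.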

Answer 5 · 2024-01-14T18:21:22.000Z

Sorry, no idea about it. It seems that it would allow exporting as PDF (with JPEG inside) from text.

Answer 6 · 2024-02-05T09:42:57.000Z

Cannot reproduce with 3.4.2; please retest and reopen if still relevant. Also, in the event of a Tesseract crash, there is not much I can do on the gImageReader side; the issue would need to be reported to Tesseract upstream.