manisandro/gImageReader

Crash (without warning) when reading certain sequences of characters

rgreen5 opened this issue · 7 comments

gImageReader 3.3.1 / Linux Mint 20.1 / Cinnamon 4.8.6

This is reproducible every time with PDF pages that contain certain sequences of characters.

To reproduce

Note: page references are to the gImageReader pages, not the page numbers at the bottom of the PDF.

  1. Go to Anna's Archive and download a PDF of "The Secret Power of Music" (David Tame, 5.8 MB).
  2. Try to scan a section of the PDF containing either page 200 or page 214.

RESULT: The app works normally until it has finished scanning the page in question, then it closes/crashes without warning. What the pages have in common is a sequence of em/en (?) dashes connecting words.

Cannot reproduce this with my setup: Debian Linux Testing, gImageReader commit a4820e.
Tested with language en, OCR mode 'hOCR, PDF' on pages 200 and 214, 192-205, 200-215.
Tested with language en, OCR mode 'plain text' on pages 200-215.

I've slightly edited my OP to make clear that the app only crashes after scanning the page in question.

Can you please post a stack trace of the crash?

If you can supply instructions I'll give it a try.

Actually, the first step would be to try the latest version, 3.4.1.

Then to get a stack trace, install gdb and:

$ gdb ./gimagereader-qt5 # (or gimagereader-qt6 or gimagereader-gtk depending on which version you are using)
(gdb) run
# Trigger crash
(gdb) bt

and post the output of the gdb bt command.
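
If typing into the gdb prompt is awkward, the same steps can probably be run non-interactively. The following is only a sketch: the function name `capture_backtrace` and the default log file name are made up for illustration, and the binary name should be adjusted to whichever build is installed. It relies on gdb's standard `-batch`, `-ex` and `--args` options.

```shell
# Hypothetical helper (not part of gImageReader): run a program under gdb,
# wait for it to stop, and save the backtrace to a file.
capture_backtrace() {
    # First argument: binary to run (assumed default: gimagereader-qt5).
    binary="${1:-gimagereader-qt5}"
    # Second argument: where to save the output (hypothetical default name).
    logfile="${2:-gimagereader-backtrace.txt}"

    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb is not installed" >&2
        return 1
    fi

    # -batch makes gdb exit once all commands have run;
    # each -ex command is executed in order: first "run", then "bt".
    gdb -batch -ex run -ex bt --args "$binary" > "$logfile" 2>&1
    echo "backtrace written to $logfile"
}
```

Call `capture_backtrace gimagereader-qt5`, trigger the crash in the application window as before, and then post the contents of the log file. If the program exits without crashing, the file will not contain a useful stack.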

gImageReader 3.4.1 / Linux Mint 20.1 / Cinnamon 4.8.6

Yes. Works fine with the latest version. Thanks.

Ok thanks.