yob/pdf-reader

Extra spaces between letters in a single word

pickhardt opened this issue · 2 comments

I noticed this gem has problems extracting text from some PDFs where the text is not necessarily clean.

For instance, this file: https://www.jstor.org/stable/3684663

Some parts of it come out with a space between every letter, like: "a b o u t a r e g r e s s i o n t o o r i g i n a l c h a o s"

However, it doesn't seem to be an inherent problem with the file, because Python's PyPDF2 extracts the same passage correctly as "about a regression to original chaos".

Do you think there is some step that this reader is missing? Alternatively, is there an option I should set when using PDF::Reader to get it to read these PDFs better?
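
In case it helps with triage, here is a minimal sketch for flagging text that shows the symptom described above. The helper `letter_spaced?` and its `threshold` parameter are hypothetical names made up for this illustration and are not part of pdf-reader. It only detects the problem; it does not fix it.

```ruby
# Hypothetical helper (not part of pdf-reader): returns true when most
# whitespace-separated tokens in the text are single characters, which is
# what the letter-spaced output looks like.
def letter_spaced?(text, threshold: 0.8)
  tokens = text.split
  return false if tokens.size < 5 # too few tokens to judge

  single = tokens.count { |t| t.length == 1 }
  single.to_f / tokens.size >= threshold
end

garbled = "a b o u t a r e g r e s s i o n t o o r i g i n a l c h a o s"
clean   = "about a regression to original chaos"

puts letter_spaced?(garbled) # true
puts letter_spaced?(clean)   # false
```

To run it against a real file, one could pass in each page's extracted text, for example the strings returned by `PDF::Reader.new(path).pages.map(&:text)`. The 0.8 threshold is an arbitrary assumption and may need tuning for other documents.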

I too am experiencing this issue.