Belval/pdf2image

Error: /ioerror in --image--

orbanbalage opened this issue · 2 comments

Describe the bug
pdf2image errors out instead of completing the conversion. Some PDFs work, some don't.

➜  Downloads pdf2image output.pdf
Page-1
Page-2
Error: /ioerror in --image--
Operand stack:

Execution stack:
   %interp_exit   .runexec2   --nostringval--   image   --nostringval--   2   %stopped_push   --nostringval--   image   image   false   1   %stopped_push   1990   1   3   %oparray_pop   1989   1   3   %oparray_pop   1977   1   3   %oparray_pop   1833   1   3   %oparray_pop   --nostringval--   %errorexec_pop   .runexec2   --nostringval--   image   --nostringval--   2   %stopped_push   --nostringval--   image   1864   1   7   %oparray_pop
Dictionary stack:
   --dict:734/1123(ro)(G)--   --dict:0/20(G)--   --dict:76/200(L)--   --dict:65/75(L)--   --dict:18/25(L)--   --dict:0/15(L)--   --dict:0/15(L)--
Current allocation mode is local
Last OS error: No such file or directory
Current file position is 34815
GPL Ghostscript 9.54.0: Unrecoverable error, exit code 1
Error: Failed to launch Ghostscript!

Desktop (please complete the following information):

  • OS: macOS
pdf2image version 0.53 http://flexpaper.devaldi.com/pdf2image/, based on Xpdf version 3.02
Copyright 1999-2011 Devaldi Ltd, Gueorgui Ovtcharov and Rainer Dorsch
Copyright 1996-2007 Glyph & Cog, LLC

Can you provide a sample PDF to reproduce the issue? This looks like a poppler/Ghostscript issue rather than a pdf2image one. Unfortunately, I can't really fix bugs in poppler, as I have no visibility into the library.

Sorry, I thought I attached the file.

Indeed, gs found some issues with the file, but even after fixing them the error remains.

gs -dNOPAUSE -dBATCH -sDEVICE=nullpage output.pdf -sOutputFile=output-fix.pdf
GPL Ghostscript 9.54.0 (2021-03-30)
Copyright (C) 2021 Artifex Software, Inc.  All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
   **** Warning:  File has an invalid xref entry:  2.  Rebuilding xref table.
Processing pages 1 through 2.
(...)
   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> itext-paulo-155 (itextpdf.sf.net - lowagie.com) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

Command to fix the file:

gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output-fix.pdf output.pdf

Verify:

gs -dNOPAUSE -dBATCH -sDEVICE=nullpage output-fix.pdf
GPL Ghostscript 9.54.0 (2021-03-30)
Copyright (C) 2021 Artifex Software, Inc.  All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
Processing pages 1 through 2.
Page 1
Page 2
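
The repair and verify steps above can be scripted. The following is a minimal Python sketch, assuming `gs` is on the PATH. The helper names (`build_gs_command`, `run_gs`, `repair_and_verify`) are hypothetical, and only the Ghostscript flags already shown in this thread are used. Treating the "had errors that were repaired" banner as a failed check is also an assumption, based on the message gs printed for the original file.

```python
import subprocess
from typing import List, Optional


def build_gs_command(device: str, input_pdf: str,
                     output_pdf: Optional[str] = None) -> List[str]:
    """Build a Ghostscript argument list from the flags used in this thread."""
    cmd = ["gs", "-dNOPAUSE", "-dBATCH", f"-sDEVICE={device}"]
    if output_pdf is not None:
        # Put the output switch before the input file so it applies to that file.
        cmd.append(f"-sOutputFile={output_pdf}")
    cmd.append(input_pdf)
    return cmd


def run_gs(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run Ghostscript and capture its output as text."""
    return subprocess.run(cmd, capture_output=True, text=True)


def repair_and_verify(src: str, dst: str) -> bool:
    """Rewrite src with pdfwrite, then re-read dst with the nullpage device.

    Returns True only if both runs exit with code 0 and the second run does
    not print the "had errors that were repaired" banner (an assumed heuristic).
    """
    repair = run_gs(build_gs_command("pdfwrite", src, dst))
    if repair.returncode != 0:
        return False
    check = run_gs(build_gs_command("nullpage", dst))
    output = check.stdout + check.stderr
    return check.returncode == 0 and "had errors that were repaired" not in output
```

Calling `repair_and_verify("output.pdf", "output-fix.pdf")` would mirror the two commands above. As noted, a clean verify run does not by itself mean the original conversion error is gone.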

I wanted to check the file with poppler, but I don't remember how I installed pdf2image, and there were some conflicts in brew. I ended up uninstalling it, installing xpdf instead, and running:

pdfimages output.pdf output-images

which works.

Perhaps there are no images in the file at all, and that is the problem? Xpdf does make images out of the pages, which is what I wanted, I think.
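
For context, `pdfimages` extracts images embedded in a PDF, while rendering whole pages to image files is what `pdftoppm` does. The following is a minimal Python sketch of the page-rendering route, assuming poppler's `pdftoppm` is installed and supports `-png` (the Xpdf build may need `pdftopng` instead). The helper names, the `output-page` prefix, and the 150 DPI default are hypothetical.

```python
import subprocess
from typing import List


def build_pdftoppm_command(input_pdf: str, output_prefix: str,
                           dpi: int = 150) -> List[str]:
    """Build a pdftoppm argument list that renders every page to PNG."""
    # -r sets the resolution in DPI; -png selects PNG output (poppler build).
    return ["pdftoppm", "-r", str(dpi), "-png", input_pdf, output_prefix]


def render_pages(input_pdf: str, output_prefix: str, dpi: int = 150) -> int:
    """Run pdftoppm and return its exit code (0 means success)."""
    result = subprocess.run(
        build_pdftoppm_command(input_pdf, output_prefix, dpi),
        capture_output=True,
        text=True,
    )
    return result.returncode
```

Calling `render_pages("output.pdf", "output-page")` would write one PNG per page using the given prefix, whether or not the file contains any embedded images.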

Here is the file if you want to check on your end.

output.pdf