cannot identify image file using pdf2image.convert_from_bytes
alistairwgillespie opened this issue · 1 comments
alistairwgillespie commented
Hi,
I'm using AWS Lambda to run pipelines that consume PDF documents.
When attempting to optimize memory allocation for pdf2image.convert_from_bytes using context management and an output_folder, I get the following error:
`cannot identify image file '/tmp/tmprz6rwu8a/a606ca84-e027-4d88-88aa-6d25099a9776-18.ppm'`
My code looks like so:
import tempfile

import pdf2image

pil_images = None
images = None
with tempfile.TemporaryDirectory() as tmpdir:
    pil_images = pdf2image.convert_from_bytes(
        document_bytes,
        dpi=dpi,
        output_folder=tmpdir
    )
    pil_images = [rsz(i, resize) for i in pil_images]
    images = [image_to_bytes(i, fmt) for i in pil_images]
...
Any help is much appreciated.
Belval commented
Does this happen with a specific PDF file?
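To help answer that, it may be worth looking at what actually ends up in the output folder. Pillow reports `cannot identify image file` when it cannot recognize a file's contents, and an empty file or one with an incomplete header is one possible cause (for example, if `/tmp` is running low on space). The following is only a minimal sketch under that assumption; `inspect_output_folder` and its `min_bytes` threshold are hypothetical names for illustration and are not part of pdf2image.

```python
import os
import shutil
import tempfile


def inspect_output_folder(folder, min_bytes=16):
    """Report free disk space and suspiciously small files in `folder`.

    `min_bytes` is an arbitrary illustrative threshold (an assumption):
    files smaller than this are unlikely to hold a complete image header.
    """
    usage = shutil.disk_usage(folder)  # total, used, free (in bytes)
    suspicious = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path) and os.path.getsize(path) < min_bytes:
            suspicious.append(name)
    return {"free_bytes": usage.free, "suspicious": suspicious}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        # The conversion with output_folder=tmpdir would run here;
        # inspect the folder before the context manager deletes it.
        print(inspect_output_folder(tmpdir))
```

If the report lists the `.ppm` file named in the error, or shows very little free space, that would point at the files on disk rather than at a particular PDF.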