LeoFCardoso/pdf2pdfocr

PIL.Image.DecompressionBombError: Image size (235978454 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.

Closed this issue · 2 comments

While applying OCR to a PDF with the repo's Docker image "leofcardoso/pdf2pdfocr:latest", this error occurred:

[2023-09-05 10:35:58.939733] [LOG] Welcome to pdf2pdfocr version 1.12.0 marapurense - https://github.com/LeoFCardoso/pdf2pdfocr
[2023-09-05 10:35:58.959460] [LOG] Input file /home/docker/Dummy_IS.pdf: type is application/pdf
[2023-09-05 10:35:59.047502] [LOG] Converting input file to images...
[2023-09-05 10:36:38.577186] [LOG] Checking blank pages
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/usr/local/bin/pdf2pdfocr.py", line 249, in do_check_img_colors_size
    im = Image.open(param_image_file)
  File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3172, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3159, in _open_core
    _decompression_bomb_check(im.size)
  File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3068, in _decompression_bomb_check
    raise DecompressionBombError(
PIL.Image.DecompressionBombError: Image size (235978454 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/pdf2pdfocr.py", line 1530, in <module>
    pdf2ocr.ocr()
  File "/usr/local/bin/pdf2pdfocr.py", line 712, in ocr
    self.check_blank_pages(image_file_list)
  File "/usr/local/bin/pdf2pdfocr.py", line 1010, in check_blank_pages
    blank_map_values = colors_size_pool_map.get()
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
PIL.Image.DecompressionBombError: Image size (235978454 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.

Hi, thank you for the post.
Can you please share your source file?
This error can probably be avoided by rendering the page images at a lower resolution. Please try the "-r 200" flag and let's see what happens.
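
To see why a lower resolution helps, here is a rough sketch of the arithmetic in Python. It relies on Pillow's default Image.MAX_IMAGE_PIXELS of 89478485: Pillow issues a DecompressionBombWarning above that value and raises DecompressionBombError above twice that value, which is the 178956970 limit quoted in the traceback. The 300 DPI starting resolution is an assumption about the default used here, and the helper names scaled_pixels and classify are made up for illustration; they are not part of pdf2pdfocr or Pillow.

```python
# Pillow's default Image.MAX_IMAGE_PIXELS. Above it Pillow warns;
# above twice this value it raises DecompressionBombError.
MAX_IMAGE_PIXELS = 89478485
ERROR_THRESHOLD = 2 * MAX_IMAGE_PIXELS  # 178956970, the limit in the traceback


def scaled_pixels(pixels: int, old_dpi: int, new_dpi: int) -> int:
    """Estimate the pixel count of the same page rendered at another DPI.

    Width and height both scale linearly with DPI, so the pixel count
    scales with the square of the DPI ratio.
    """
    return round(pixels * (new_dpi / old_dpi) ** 2)


def classify(pixels: int) -> str:
    """Mirror Pillow's size check: 'error', 'warning', or 'ok'."""
    if pixels > ERROR_THRESHOLD:
        return "error"
    if pixels > MAX_IMAGE_PIXELS:
        return "warning"
    return "ok"


reported_pixels = 235978454  # page size from the traceback
assumed_dpi = 300            # ASSUMPTION: resolution used for the failing run

at_200 = scaled_pixels(reported_pixels, assumed_dpi, 200)
print("original:", classify(reported_pixels))
print("at 200 DPI:", at_200, "pixels ->", classify(at_200))
```

Under that assumption, the page rendered at 200 DPI drops below the error threshold, although it would still be large enough for Pillow to emit a warning. For trusted input, Pillow also allows raising Image.MAX_IMAGE_PIXELS, but lowering the resolution avoids changing the safety check.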

Yes, great!
"-r 200" works.
Thank you so much for your quick response.