Zero OCR'ed files
File: D:\Google_drive_sola\Sola\2022-2023\ROP - Reologija polimerov\RLP - Reologija polimerov.pdf
[2023-01-14 19:20:35.717707] [DEBUG] Tesseract can 'textonly_pdf': True
[2023-01-14 19:20:35.733704] [DEBUG] Tesseract version: 5
[2023-01-14 19:20:35.736704] [DEBUG] cuneiform not available
[2023-01-14 19:20:35.781705] [DEBUG] Pdftoppm version: 22.12.0
[2023-01-14 19:20:35.811712] [DEBUG] Qpdf version: 11.2.0
[2023-01-14 19:20:35.811712] [DEBUG] Temp dir is C:\Users\ADMINI~1\AppData\Local\Temp\pdf2pdfocr_L3VRF
[2023-01-14 19:20:35.811712] [DEBUG] Prefix is L3VRF
[2023-01-14 19:20:35.811712] [DEBUG] Script dir is c:\Users\Administrator\anaconda3\Scripts
[2023-01-14 19:20:35.812712] [DEBUG] Parallel operations will use 20 CPUs
[2023-01-14 19:20:35.861715] [LOG] Welcome to pdf2pdfocr version 1.12.0 marapurense - https://github.com/LeoFCardoso/pdf2pdfocr
[2023-01-14 19:20:35.903716] [LOG] Input file D:\Google_drive_sola\Sola\2022-2023\ROP - Reologija polimerov\RLP - Reologija polimerov.pdf: type is application/pdf
[2023-01-14 19:20:35.918716] [DEBUG] User conversion params: best
[2023-01-14 19:20:35.918716] [DEBUG] Output file: D:\Google_drive_sola\Sola\2022-2023\ROP - Reologija polimerov\RLP - Reologija polimerov-OCR.pdf for PDF and D:\Google_drive_sola\Sola\2022-2023\ROP - Reologija polimerov\RLP - Reologija polimerov-OCR.pdf.txt for TXT
[2023-01-14 19:20:35.918716] [LOG] Converting input file to images...
[2023-01-14 19:20:43.633767] [LOG] Checking blank pages
C:\Users\Administrator\anaconda3\lib\site-packages\PIL\Image.py:3074: DecompressionBombWarning: Image size (105023996 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
warnings.warn(
[2023-01-14 19:20:44.652767] [LOG] Starting OCR with tesseract...
[2023-01-14 19:20:45.154768] [LOG] OCR completed
[2023-01-14 19:20:45.155767] [DEBUG] We have 0 ocr'ed files
Error: No PDF files generated after OCR. This is not expected. Aborting.
Can you please share the input file?
Just out of curiosity, is the installation OK?
The PDF is output from the note-taking app Inkodo, from the Microsoft Store.
Hello @PatrikHlebecStor.
Your PDF worked for me. :(
Please try adding "-r 200" to the command line (this decreases the image resolution and should resolve the DecompressionBombWarning).
Can other PDF files be OCRed in your installation?
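For context on the resolution suggestion, here is a rough sketch of the arithmetic behind the DecompressionBombWarning in the log. The two pixel counts are copied from the warning text; 89478485 is Pillow's default `Image.MAX_IMAGE_PIXELS`. The helper names are made up for illustration, and the log does not show which DPI the "best" preset used, so this only estimates the relative reduction needed, assuming the pixel count scales with the square of the resolution.

```python
import math

# Values copied from the DecompressionBombWarning line in the log.
IMAGE_PIXELS = 105_023_996   # size of the rendered page image
PILLOW_LIMIT = 89_478_485    # Pillow's default Image.MAX_IMAGE_PIXELS


def max_dpi_scale(pixels: int, limit: int) -> float:
    """Largest linear (DPI) scale factor that keeps the image within the limit.

    Hypothetical helper: the pixel count grows with the square of the DPI,
    so the allowed linear scale is the square root of the pixel ratio.
    """
    return math.sqrt(limit / pixels)


def pixels_after_scaling(pixels: int, scale: float) -> int:
    """Estimated pixel count after scaling the resolution by `scale`."""
    return int(pixels * scale * scale)


scale = max_dpi_scale(IMAGE_PIXELS, PILLOW_LIMIT)
print(f"Resolution must drop to at most {scale:.1%} of the current value")
print(f"At 90% resolution: {pixels_after_scaling(IMAGE_PIXELS, 0.9)} pixels")
```

Under these assumptions, a fairly small drop in resolution is already enough to silence the warning. Note that this is a warning only; it may or may not be related to the "0 ocr'ed files" error.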
Closing due to inactivity