This Python script extracts images from a PDF file using the pdfminer.six and Pillow libraries.
- Python 3.x
- pip (Python package installer)
- Clone this repository or download the script files.
- Ensure your PDF file is named `file.pdf` and is in the same directory as the script.
- Install the required packages: `pip3 install -r requirements.txt`
- Place your PDF file (named `file.pdf`) in the same directory as the script.
- Run the script: `python3 main.py`
- The extracted images will be saved in a new directory named `PDF_Images` within the same folder.
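
For readers curious how this kind of extraction can work, here is a minimal sketch. It is not the contents of `main.py`; it only assumes the documented pdfminer.six calls `extract_pages` and `ImageWriter.export_image`. The function names `iter_images` and `extract_images` are hypothetical, and the defaults `file.pdf` and `PDF_Images` are taken from the steps above.

```python
from typing import Any, Callable, Iterator


def iter_images(element: Any,
                is_image: Callable[[Any], bool],
                is_container: Callable[[Any], bool]) -> Iterator[Any]:
    """Yield every image under `element`, descending into nested containers."""
    if is_image(element):
        yield element
    elif is_container(element):
        for child in element:
            yield from iter_images(child, is_image, is_container)


def extract_images(pdf_path: str = "file.pdf", out_dir: str = "PDF_Images") -> list:
    """Export all images found in `pdf_path` into `out_dir` (assumed workflow)."""
    # Imported here so iter_images stays usable without pdfminer.six installed.
    from pdfminer.high_level import extract_pages
    from pdfminer.image import ImageWriter
    from pdfminer.layout import LTFigure, LTImage, LTPage

    writer = ImageWriter(out_dir)  # creates out_dir if it does not exist
    saved = []
    for page in extract_pages(pdf_path):
        images = iter_images(
            page,
            is_image=lambda e: isinstance(e, LTImage),
            is_container=lambda e: isinstance(e, (LTPage, LTFigure)),
        )
        for image in images:
            # export_image returns the file name it wrote inside out_dir
            saved.append(writer.export_image(image))
    return saved
```

Calling `extract_images()` with no arguments would read `file.pdf` and write into `PDF_Images`. The walker is kept separate from the pdfminer.six imports so the recursion into figures can be checked on its own.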
- The script will extract both standalone images and images within figures.
- Extracted images will be saved in their original format when possible.
- If you encounter any issues, ensure that your PDF file is not encrypted or password-protected.
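
A quick way to check the last point before running the script is sketched below. This is a heuristic only, not part of the script: it relies on the fact that an encrypted PDF references an `/Encrypt` dictionary from its trailer, and it can report a false positive if those bytes happen to appear elsewhere in the file. The helper name `looks_encrypted` is hypothetical.

```python
from pathlib import Path


def looks_encrypted(pdf_path: str) -> bool:
    """Heuristic check: does the raw file mention an /Encrypt entry?"""
    # Reads the whole file into memory; fine for typical documents,
    # but an assumption worth revisiting for very large PDFs.
    data = Path(pdf_path).read_bytes()
    return b"/Encrypt" in data
```

If `looks_encrypted("file.pdf")` returns `True`, removing the password protection with a PDF tool of your choice before running the script is a reasonable first step.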
This project is open source and available under the MIT License.