
PDF Parser Tools

🔍 PDFparser is a basic Python script that lets you download, find, and parse PDF files! 🔧


USAGE:

Clone the repository:

git clone https://github.com/Grogny/PDFparser

Open the repository:

cd PDFparser

Install the requirements:

pip install -r requirements.txt

Run the script:

To download a PDF file:

python3 pdfparser.py -d [PDF_URL]

or

python3 pdfparser.py --download [PDF_URL]
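How pdfparser.py performs the download internally is not documented here. As a rough illustration only, a download step could be sketched with the Python standard library as follows. The function names `download_pdf` and `looks_like_pdf` are hypothetical and are not taken from the script.

```python
import shutil
import urllib.request
from pathlib import Path


def download_pdf(url: str, dest: str) -> Path:
    """Stream the resource at `url` into the file `dest` and return its path.

    Hypothetical helper: this is an assumption about how a downloader
    could work, not the code used by pdfparser.py.
    """
    target = Path(dest)
    # Copy the response body in chunks so large files are not held in memory.
    with urllib.request.urlopen(url) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target


def looks_like_pdf(path: Path) -> bool:
    """Return True if the file starts with the PDF magic bytes "%PDF-"."""
    with path.open("rb") as fh:
        return fh.read(5) == b"%PDF-"
```

Checking the first bytes after the download is a cheap way to notice that a URL returned an HTML error page instead of a PDF.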

To check whether a website contains PDF files:

python3 pdfparser.py -f [WEBSITE_URL]

or

python3 pdfparser.py --find [WEBSITE_URL]
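The README does not say how the finder detects PDF files. One plausible approach, shown here only as a sketch under that assumption, is to scan a page's links for targets whose path ends in ".pdf". The names `PdfLinkCollector` and `find_pdf_links` are hypothetical, and the example uses only the standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkCollector(HTMLParser):
    """Collect absolute URLs of <a href="..."> links that point to .pdf files."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                absolute = urljoin(self.base_url, value)
                # Compare the path only, so query strings are ignored.
                if urlparse(absolute).path.lower().endswith(".pdf"):
                    self.links.append(absolute)


def find_pdf_links(html: str, base_url: str) -> list:
    """Return the PDF links found in `html`, in document order."""
    collector = PdfLinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

A check based on the file extension misses PDFs served from URLs without a ".pdf" suffix; inspecting the Content-Type header of each link would be a stricter alternative.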

To extract information from a PDF:

python3 pdfparser.py -e [PDF_NAME]

or

python3 pdfparser.py --extract [PDF_NAME]
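The three modes above map naturally onto a small command-line parser. The following is a minimal sketch of how such options could be declared with `argparse`; the flag names come from the usage above, while `build_parser` and the choice to make the modes mutually exclusive are assumptions, not a description of the actual script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the -d/-f/-e options listed in the usage section."""
    parser = argparse.ArgumentParser(
        prog="pdfparser.py",
        description="Download, find, and parse PDF files.",
    )
    # Assumption: exactly one mode is selected per invocation.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-d", "--download", metavar="PDF_URL",
                       help="download the PDF at this URL")
    group.add_argument("-f", "--find", metavar="WEBSITE_URL",
                       help="check whether this website links to PDF files")
    group.add_argument("-e", "--extract", metavar="PDF_NAME",
                       help="extract information from this local PDF file")
    return parser
```

Calling `build_parser().parse_args()` in a script would then expose the chosen value as `args.download`, `args.find`, or `args.extract`, with the other two set to `None`.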

❗ I have only tested the script in a Debian terminal; I don't know whether it works on Windows. ❗


EXAMPLES AND SCREENSHOTS:

Downloader example:

Finder example:

Extractor example:

Note:

All unnamed PDF files will be renamed "Untitled.pdf".
⚠️ I do not own the PDF or the website used in the examples. ⚠️
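The note above does not define what counts as an "unnamed" PDF. Assuming it means a URL whose path carries no file name, the "Untitled.pdf" fallback could be sketched like this. The function `filename_from_url` is hypothetical and only illustrates the idea.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def filename_from_url(url: str, default: str = "Untitled.pdf") -> str:
    """Return the last path segment of `url`, or `default` if there is none.

    Assumption: "unnamed" means the URL path does not end in a file name.
    """
    # Decode percent-escapes (e.g. %20) and keep only the final path component.
    name = PurePosixPath(unquote(urlparse(url).path)).name
    return name or default
```

For example, `filename_from_url("https://example.com/files/report.pdf")` gives `"report.pdf"`, while a URL that ends in `/` falls back to `"Untitled.pdf"`.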


MODULES I USED:


DISCLAIMER:

I am not responsible for what you do with PDFparser or for the information you retrieve with it.