CVE-2021-40444-URL-Extractor

A Python script that extracts embedded URLs from document files (.docx, .docm, .rtf) by parsing document.xml.rels, then defangs them with the defang package. It uses python-magic for file type identification.
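
For readers who want to see what parsing document.xml.rels can look like, here is a minimal sketch. It is not the code from url-extract.py. It assumes the relationships part sits at the usual word/_rels/document.xml.rels path inside the .docx/.docm ZIP container, and that only relationships marked TargetMode="External" are of interest. The function name extract_external_targets is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Usual location of the relationships part in a .docx/.docm archive (assumption).
RELS_PATH = "word/_rels/document.xml.rels"
# XML namespace used by Open Packaging Conventions relationship parts.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def extract_external_targets(source):
    """Return the Target of every external relationship in the document.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(source) as archive:
        rels_xml = archive.read(RELS_PATH)

    root = ET.fromstring(rels_xml)
    return [
        rel.get("Target")
        for rel in root.iter(f"{RELS_NS}Relationship")
        # Internal parts (styles.xml, etc.) carry no TargetMode attribute.
        if rel.get("TargetMode") == "External"
    ]
```

Calling extract_external_targets("document.docx") would return the raw, still-live targets as a list of strings. When handling untrusted samples, a hardened XML parser such as defusedxml may be a safer choice than the standard library parser.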

Install

git clone https://github.com/gh0stxplt/CVE-2021-40444-URL-Extractor.git

cd CVE-2021-40444-URL-Extractor

pip install -r requirements.txt

Usage

python3 url-extract.py document.docx

Output

-------------------------
     Defanged URLs:
-------------------------
hXXps[:]//github[.]com/gh0stxplt/CVE-2021-40444-URL-Extractor
hXXp[:]//www.notspooky[.]com/234
hXXps[:]//www.spookyguy[.]com/123
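
The defanged form shown above (hXXp scheme, bracketed colon, bracketed dot before the top-level domain) can be reproduced with a few string operations. The sketch below is a hand-written stand-in for illustration, not the defang package the script uses, and the name defang_url is hypothetical. It assumes the input is a URL of the form scheme://host/path.

```python
def defang_url(url):
    """Defang a URL in the style of the sample output (illustrative stand-in)."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        # Not a scheme://... URL; leave it untouched.
        return url

    host, slash, path = rest.partition("/")

    # Bracket only the last dot of the host, as in the sample output.
    name, dot, tld = host.rpartition(".")
    if dot:
        host = f"{name}[.]{tld}"

    # http -> hXXp, https -> hXXps
    scheme = scheme.replace("http", "hXXp")
    return f"{scheme}[:]//{host}{slash}{path}"


print(defang_url("https://www.spookyguy.com/123"))
```

Bracketing only the final dot keeps the hostname readable while still preventing mail clients and chat tools from turning it into a clickable link.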