IOC Parser is a tool to extract indicators of compromise (IOCs) from security reports in PDF format. A good collection of APT-related reports with many IOCs is available in APTNotes.
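To illustrate what extracting indicators from report text can look like, here is a minimal sketch built on regular expressions. The pattern names and expressions are assumptions chosen for the example, not the tool's actual pattern set, and `extract_iocs` is a hypothetical helper.

```python
import re

# Hypothetical patterns for illustration only; not the tool's real pattern file.
PATTERNS = {
    "IP": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "MD5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "CVE": re.compile(r"\bCVE-\d{4}-\d{4,7}\b"),
}


def extract_iocs(text):
    """Return (type, match) tuples, grouped by pattern and ordered by position."""
    found = []
    for ioc_type, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            found.append((ioc_type, m.group(0)))
    return found


sample = (
    "C2 at 192.168.10.5 dropped d41d8cd98f00b204e9800998ecf8427e "
    "via CVE-2014-1761."
)
for ioc_type, value in extract_iocs(sample):
    print(ioc_type, value)
```

Real reports often obfuscate indicators (for example `hxxp://` or `[.]`), so production patterns are usually more permissive than these.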
iocp [-h] [-p INI] [-i FORMAT] [-o FORMAT] [-d] [-l LIB] FILE
- FILE File/directory path to report(s)
- -p INI Pattern file
- -i FORMAT Input format (pdf/txt/html)
- -o FORMAT Output format (csv/json/yara)
- -d Deduplicate matches
- -l LIB Parsing library
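The `-p`, `-d`, and `-o` options can be pictured with the sketch below: patterns are loaded from an INI file, repeated matches are dropped, and the hits are rendered as JSON or CSV. The INI layout (one section per indicator type with a `pattern` key), the output field names, and the helpers `load_patterns`, `scan`, and `render` are all assumptions made for illustration; the tool's real file format and output may differ.

```python
import configparser
import csv
import io
import json
import re

# Hypothetical INI layout: one section per indicator type, with a "pattern" key.
PATTERN_INI = r"""
[Email]
pattern = \b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b

[SHA1]
pattern = \b[a-fA-F0-9]{40}\b
"""


def load_patterns(ini_text):
    # interpolation=None keeps "%" in a regex from being read as a placeholder.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {name: re.compile(parser[name]["pattern"]) for name in parser.sections()}


def scan(text, patterns, dedup=False):
    """Yield (type, match) pairs; with dedup=True each pair is yielded once."""
    seen = set()
    for ioc_type, pattern in patterns.items():
        for m in pattern.finditer(text):
            hit = (ioc_type, m.group(0))
            if dedup and hit in seen:
                continue
            seen.add(hit)
            yield hit


def render(hits, fmt):
    """Render hits as one JSON object per line, or as CSV rows."""
    if fmt == "json":
        return "\n".join(json.dumps({"type": t, "match": v}) for t, v in hits)
    if fmt == "csv":
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerows(hits)
        return buf.getvalue().rstrip("\n")
    raise ValueError(f"unsupported format: {fmt}")


text = "Contact evil@example.com or evil@example.com; hash " + "a" * 40
patterns = load_patterns(PATTERN_INI)
print(render(scan(text, patterns, dedup=True), "json"))
```

Deduplicating on the `(type, match)` pair rather than on the match alone keeps a value that legitimately fits two indicator types.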
Sample usages
python ioc-parser -i txt -o json -d test/sample.txt
python ioc-parser -i pdf -l pypdf2 -o json -d test/sample.pdf
python ioc-parser -i html -o json -d test/sample.html
pip install -r requirements
One of the following PDF parsing libraries:
For HTML parsing support:
- BeautifulSoup - pip install beautifulsoup4
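HTML reports need to be reduced to plain text before patterns are matched. The sketch below shows that step using the standard library's `html.parser` as a dependency-free stand-in; it is not how the tool itself does it. With BeautifulSoup installed, `BeautifulSoup(html, "html.parser").get_text()` serves a similar purpose. `TextExtractor` and `html_to_text` are hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    "<html><head><style>p{}</style></head><body>"
    "<p>Beacon to <b>10.0.0.1</b></p><script>var x;</script>"
    "</body></html>"
)
print(html_to_text(page))
```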
For HTTP(S) support:
- requests - pip install requests