elastic-gdpr-scanner

Scan Elasticsearch instances to check for GDPR compliance.

Disclaimer: Neither Vincent Maury nor Elastic can be held responsible for the use of this script! Use it at your own risk.

Getting Started

These instructions will get you a copy of the project up and running on your local machine.

Running the scanner

Prerequisites

This script has no prerequisite other than Python 3; no additional library is needed. It should work on any platform (tested on Windows so far).

Running the script

Just clone this repository and run the script.

git clone https://github.com/blookot/elastic-gdpr-scanner
cd elastic-gdpr-scanner
python elastic-gdpr-scanner.py -h

The script has several options:

  • -h will display help.
  • --target TARGET to enter a specific target (a hostname, a single IP, or an IP range in CIDR format, e.g. 10.50.3.0/24). Defaults to localhost.
  • --port PORT to specify the port where Elasticsearch is running. Defaults to 9200.
  • --user USER to set the username used when trying to authenticate to Elasticsearch. Defaults to elastic.
  • --password PASSWORD to set the password used when trying to authenticate to Elasticsearch. Defaults to changeme.
  • --regex REGEX to add a specific regular expression to look for in the documents, such as your username. A default list of regexes is provided in the script.
  • --nb-threads NB_THREADS to specify how many hosts you want to scan in parallel. Defaults to 10.
  • --socket-timeout TIMEOUT to set the timeout for the socket connect (open port test), in seconds. Use 2 over the Internet and 0.5 on local networks. Defaults to 2.
  • --run-scan to run the search for GDPR data (based on regex matching). Defaults to False, in which case the script only inventories Elasticsearch instances without going into indices or running the regex matching.
  • --out OUT to specify the name of the file where results are written. Defaults to es-gdpr-report.csv.
  • --verbose turns on verbose output in console. Defaults to False.
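
To illustrate how --target, --port, --nb-threads and --socket-timeout fit together, here is a minimal sketch using only the standard library. It is not taken from the script: the function names (expand_target, is_port_open, find_open_hosts) are hypothetical, and the actual implementation may differ.

```python
import ipaddress
import socket
from concurrent.futures import ThreadPoolExecutor


def expand_target(target):
    """Return the hosts described by a hostname, a single IP, or a CIDR range."""
    try:
        network = ipaddress.ip_network(target, strict=False)
    except ValueError:
        # Not an IP address or CIDR range, so treat it as a hostname.
        return [target]
    # hosts() skips the network and broadcast addresses of a range; fall back
    # to the network address itself when the target is a single IP.
    return [str(ip) for ip in network.hosts()] or [str(network.network_address)]


def is_port_open(host, port=9200, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def find_open_hosts(target="localhost", port=9200, nb_threads=10, timeout=2.0):
    """Probe every host of the target in parallel and return the reachable ones."""
    hosts = expand_target(target)
    with ThreadPoolExecutor(max_workers=nb_threads) as pool:
        results = list(pool.map(lambda h: is_port_open(h, port, timeout), hosts))
    return [host for host, is_open in zip(hosts, results) if is_open]
```

In this sketch, a /24 range expands to 254 hosts, so a shorter timeout on a local network noticeably reduces the total scan time.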

Report

You can grab the es-gdpr-report.csv file generated by this script as a report. Column titles should be self-explanatory. The file is in CSV format, because I love Microsoft Excel :-)
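
If you would rather inspect the report from Python than from Excel, something along these lines should work. This is only a sketch: load_report is a hypothetical helper, and the comma delimiter and UTF-8 encoding are assumptions about the file, not facts from the script.

```python
import csv


def load_report(path="es-gdpr-report.csv", delimiter=","):
    """Return (column_titles, rows) where each row is a dict keyed by column title."""
    # The delimiter and encoding are assumptions; adjust them to the actual file.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        rows = list(reader)
        return reader.fieldnames or [], rows
```

Calling load_report() with no arguments reads the default report name; pass the value you gave to --out otherwise.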

Note: this script only scans non-internal indices (those whose names do not start with .), so the sum over the scanned indices does not equal the totals reported for each node.
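
The index filter described in the note, and the regex matching performed with --run-scan, can be sketched as follows. The helper names and the email pattern are illustrative assumptions; the script ships its own default list of regexes, which is not reproduced here.

```python
import re

# Illustrative pattern only, not one of the script's default regexes.
EXAMPLE_REGEXES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def non_internal_indices(index_names):
    """Keep only the indices whose names do not start with a dot."""
    return [name for name in index_names if not name.startswith(".")]


def iter_strings(value):
    """Yield every string value found in a (possibly nested) document."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def matching_regexes(document, regexes):
    """Return the sorted names of the regexes that match any string in the document."""
    return sorted(
        name
        for name, pattern in regexes.items()
        if any(pattern.search(text) for text in iter_strings(document))
    )
```

For example, given the index names .kibana, customers and logs-2019, only customers and logs-2019 would be scanned in this sketch.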

Authors

  • Vincent Maury - Initial commit - blookot

License

This project is licensed under the Apache 2.0 License; see the LICENSE.md file for details.

Acknowledgments

  • Check this website for further regexes
  • Found this repo which is a great initiative, but sadly empty...
  • Inspired by my old vulnerability scanner startup... :-)