DumpDB imports credential dumps into a database to improve search performance.
Two types of databases are created: one stores the breach sources, and the other stores the dumped records. There is a single sources database, which records where each credential came from (e.g. adobe2013 or collection1), and one or more record databases, which are indexed and searched separately.
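As a rough illustration of the split, the two row shapes might look like the following. This is a hypothetical sketch only; DumpDB defines its own schema, and the field names here are assumptions (informed by the `email_rev` index described under `init` below):

```go
package main

import "fmt"

// Hypothetical illustration only; not DumpDB's actual schema.

// Source: one row per imported breach, kept in the single sources database.
type Source struct {
	ID   int64  // referenced by Record.SourceID
	Name string // e.g. "adobe2013" or "collection1"
}

// Record: one row per leaked credential, kept in a record database.
type Record struct {
	Email    string
	EmailRev string // Email reversed, so suffix searches can use an index
	Password string
	SourceID int64 // which breach this credential came from
}

func main() {
	src := Source{ID: 1, Name: "adobe2013"}
	rec := Record{Email: "user@example.com", EmailRev: "moc.elpmaxe@resu", SourceID: src.ID}
	fmt.Printf("%+v\n%+v\n", src, rec)
}
```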
This project requires Go version 1.15 or later. You will also need access to a MariaDB (recommended) or MySQL server.
```sh
go get -u github.com/darkmattermatt/dumpdb
```
Initialise the databases:

```sh
go run github.com/darkmattermatt/dumpdb init -c "user:pass@tcp(127.0.0.1:3306)" -s sources -d adobe2013,collection1
```
Import the dumped data:

```sh
go run github.com/darkmattermatt/dumpdb import -c "user:pass@tcp(127.0.0.1:3306)" -s sources -d adobe2013 -p adobe /path/to/data.tar.gz /more/data.txt
go run github.com/darkmattermatt/dumpdb import -c "user:pass@tcp(127.0.0.1:3306)" -s sources -d collection1 -p collections /path/to/data.tar.gz /more/data.txt
```
Search the indexed data:

```sh
go run github.com/darkmattermatt/dumpdb search -c "user:pass@tcp(127.0.0.1:3306)" -s sources -d adobe2013,collection1 -Q "email LIKE '%@example.com' LIMIT 10"
```
Verbosity:
Output levels are as follows:
FATAL
: Only show errors and search results

RESULT
: Only show errors and search results

WARNING
: Nonfatal errors (usually occurring in one of the query threads)

INFO
: The default level, provides minimal information at each step of the process

VERBOSE
: Tells you what's going on

DEBUG
: Spews out data
Global Parameters:
`config=''`
: Config file

`v=3`
: Verbosity. Set this flag multiple times for more verbosity

`q=0`
: Quiet. This is subtracted from the verbosity; for example, a single `-q` against the default verbosity of 3 (INFO) gives an effective level of 2 (WARNING)
Initialise a database for importing.
Parameters:
`databases+`
: One or more positional arguments of databases to initialise

`databases=""`
: Comma separated list of databases to initialise

`conn=`
: Connection string for the MySQL server, like `user:pass@tcp(127.0.0.1:3306)`

`sourcesDatabase=""`
: Initialise the following database as the one to store sources in

`engine="Aria"`
: The database engine. Aria is recommended (requires MariaDB); MyISAM is supported for MySQL

`indexes="email_rev"`
: Comma separated list of columns to index in the main database. `email_rev` is strongly recommended, since it enables searching by email suffixes such as `@example.com` (see the sketch after this list)
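Why a reversed-email column helps: a leading-wildcard search like `email LIKE '%@example.com'` cannot use a B-tree index, but reversing both the column and the pattern turns it into an index-friendly prefix match. A minimal sketch of the idea (the helper below is illustrative, not DumpDB's actual code):

```go
package main

import "fmt"

// reverse returns s reversed rune by rune,
// e.g. "user@example.com" -> "moc.elpmaxe@resu".
func reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	// A suffix search on email becomes a prefix search on email_rev,
	// which an index can satisfy:
	//   email LIKE '%@example.com'  ->  email_rev LIKE 'moc.elpmaxe@%'
	fmt.Printf("email_rev LIKE '%s%%'\n", reverse("@example.com"))
}
```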
Process files or folders into a regularised tab-delimited text file.
Parameters:
`filesOrFolders+`
: One or more positional arguments of files and/or folders to import

`parser=`
: The custom line parser to use. Modify the internal/parseline package to add another line parser (a sketch follows after this list)

`batchSize=4e6`
: Number of lines per output file. 1e6 = ~64MB, 16e6 = ~1GB

`filePrefix="[currentTime]_"`
: Temporary processed file prefix

Inputs are handled according to their file extension:

`.tar.gz`, `.tgz`
: Decompress and open the tarball, process each file

`.txt`, `.csv`
: Create a `bufio.Scanner`

`bufio.Scanner`
: Process each line
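A line parser's job is to split one raw dump line into the regularised fields that end up in the tab-delimited output. The snippet below is a hypothetical parser in the spirit of internal/parseline; the real package defines its own types, signature, and registration mechanism:

```go
package parseline

import (
	"fmt"
	"strings"
)

// Record is a hypothetical stand-in for the fields DumpDB writes to its
// tab-delimited batch files.
type Record struct {
	Email    string
	Password string
}

// parseColonLine handles the common "email:password" dump format.
// It returns an error for lines that don't match, so callers can skip them.
func parseColonLine(line string) (Record, error) {
	parts := strings.SplitN(line, ":", 2)
	if len(parts) != 2 || !strings.Contains(parts[0], "@") {
		return Record{}, fmt.Errorf("unparseable line: %q", line)
	}
	return Record{Email: parts[0], Password: parts[1]}, nil
}
```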
Import files or folders into a database.
Parameters:
`filesOrFolders+`
: One or more positional arguments of files and/or folders to import

`parser=`
: The custom line parser to use. Modify the internal/parseline package to add another line parser

`conn=`
: Connection string for the SQL database, like `user:pass@tcp(127.0.0.1:3306)`

`database=`
: Database name to import into

`sourcesDatabase=`
: Database name to store sources in

`compress=false`
: Pack the database into a compressed, read-only format. Requires the Aria or MyISAM database engine

`batchSize=4e6`
: Number of results per temporary file (used for the `LOAD DATA INFILE` command). 1e6 = ~64MB, 16e6 = ~1GB

`filePrefix="[database]_"`
: Temporary processed file prefix
Notes:
- By default, only the `mysql` user is able to read/write to the database files directly. A workaround is to run `go build .` and then `sudo -u mysql ./dumpdb import ...`
- Only files with whitelisted file extensions are processed (to avoid trying to import a binary file as a text file). Currently supported extensions are `.tar.gz`, `.tgz`, `.txt`, and `.csv`.
Search multiple dump databases simultaneously.
Parameters:
`query=""`
: The `WHERE` clause of a SQL query. Yes, it's injected, so try not to break your own database

`columns="all"`
: Comma separated list of columns to retrieve from the database

`conn=`
: Connection string for the MySQL server, like `user:pass@tcp(127.0.0.1:3306)`

`databases=`
: Comma separated list of databases to search

`sourcesDatabase=""`
: Database name used to resolve source IDs to their names
Notes:
- The query is injected into the SQL command sent to each database, which means that any `LIMIT` statement is applied per database: searching two databases with `LIMIT 10` can return up to 20 rows in total (see the sketch below).
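A minimal sketch of why `LIMIT` behaves this way, assuming the user's clause is spliced into one `SELECT` per database (the table name and `SELECT *` here are placeholders, not DumpDB's actual statement):

```go
package main

import "fmt"

func main() {
	userQuery := "email LIKE '%@example.com' LIMIT 10" // the -Q argument
	databases := []string{"adobe2013", "collection1"}

	// One SELECT is built per database, so the LIMIT clause runs once per
	// statement: two databases can return up to 2 * 10 rows in total.
	// "records" is a placeholder table name, not DumpDB's actual schema.
	for _, db := range databases {
		fmt.Printf("SELECT * FROM `%s`.records WHERE %s\n", db, userQuery)
	}
}
```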
This project makes use of several excellent open-source libraries, listed below: