blueflower is a command-line tool that looks for secrets such as private keys or passwords in a file structure. Interesting files are detected using heuristics on their names and on their content.
Unlike some forensics tools, blueflower does not search in RAM, and does not attempt to identify cryptographic keys or algorithms in binaries.
DISCLAIMER: This program is under development. It may not work as expected and it may destroy your computer. Use at your own risk.
- search in the following types of files:
text/*
MIME-typed files- PDF, DOCX, XLSX documents
- tar, ZIP archives
- bzip2, gzip compressed files/archives
- detection of
- common key and password containers (SSH id_* , Apple keychain, Java KeyStore, etc.)
- common encrypted containers (Truecrypt, PGP Disks, GnuPG files, encrypted ZIPs, etc.)
- executables (PE, ELF, with heuristical packing detection)
- other interesting files (Bitcoin wallets, PGP policies, etc.)
- hiding of secrets searched for (names, secret keys, etc.) via a hash file
blueflower is written for Python 2.7. It will not work on Python 3.x.
From the project's top directory, you can directly run
python blueflower.py directory [hashes]
where
directory
is the root of the file structure to explorehashes
is an optional file, which should be created with the scriptmakehashes.py
(see details below)
Results are written to a log file blueflower-YYYYMMDDhhmmss
in CSV format.
(Run make clean
if you wish to remove previous log files as well as
.pyc
's.)
WARNINGS:
- no limit is set on the number of files processed (
^C
to gracefully interrupt) - RAR archives nested in other archives are not supported
- there may be a lot of false positives
To install to the global packages directory:
sudo make install
(omit sudo
on Windows)
To install locally (to site.USER_BASE):
make local
(Run make cleanall
if you wish to clean up the project's directory.)
blueflower can then be called from any location, assuming that the binaries are located in a directory included in your PATH.
To build a package executable on Windows machines that don't have Python installed, do the following (on a Windows machine with Python installed):
- install py2exe
- in blueflower's
setup.py
, addimport py2exe
, remove the line
version=blueflower.__version__,
and add the line
console=['blueflower/__main__.py'],
- run
python setup.py py2exe
This will create an executable and associated files in the dist/
folder.
Make sure that the external modules are installed as
unzipped eggs rather than as .egg
files (easy_install option
--always-unzip
).
Let's say you have a list of strings that you want to search for without revealing them. These could be names, passwords, secret keys, etc.
blueflower implements this feature, by taking as optional argument a
list of hashes (-H hashesfile
).
Obviously this comes with a performance penalty: hashing all strings
matching the regular expression given.
First, put your secret strings in a text file with one item per line, for example
secret1
secret2
secret3
Then, run
python makehashes.py yourfile
which will prompt you for
- a regular expression that matches the set of strings
- a password, which will be needed to run blueflower
This will create a file yourfile.hashes
in the same directory as
yourfile
, to give as an argument to blueflower.
The first line of .hashes
files contains the regular expression.
The second line contains 2 lowercase hex strings separated by a comma
(no space). These are generated by makehashes.py
, and
are respectively
- a salt of 8 bytes (16 hex characters), generated using
os.urandom(8)
- a verifier of 8 bytes, which is the SipHash-2-2 hash of the salt using the key derived from the password
The verifier serves to ensure that the password entered is the correct one, by checking that hashing the salt using the password entered yields a value identical to the verifier.
The subsequent lines include the SipHash-2-2 hashes of each of the secret strings, in the same order as received, using the verified key.
For example, the first 5 lines of a .hashes
file can be
\b[a-z]{2,20}\b
694d63f4630c6617,d478449c1a95f8c0
06f12ba4c3b57a9f
91f5ed23a2cc21ac
381cd35a5d203fea
A 128-bit key is derived from the password using SipHash-1000-20000 (the SipHash PRF with 1000 compression rounds and 20000 finalization rounds). This should be slow enough to mitigate bruteforce attacks, and the use of a salt makes precomputation useless.
SipHash-1000-20000 was chosen rather than a dedicated password hashing scheme (bcrypt/scrypt/PBKDF2) for simplicity, to minimize dependencies, and because the GPU-friendliness of SipHash can be compensated by really slow hashing.
The hashing speed is mostly independent of the password's length, since the 20000-iteration bottleneck is the finalization.
The presence of the verifier string allows to efficiently test the correctness of a key, and thus to bruteforce keys at a high rate by computing many SipHash-2-2 in parallel. However keys are 128-bit, and thus practically unbreakable. Again, the use of a salt makes precomputation useless.
The use of a same salt for both key derivation and verifier generation might be surprising, but it does not reduce security since different hash functions are used, and the unpredictability property is not affected.
The choice of the regular expression leaks information on the secrets strings searched for, so choose it carefully: the more general, the less leak, but the slower too (more strings will be hashed).
The log file only contains the hash corresponding to the secret string detected, not the string matched. The log file thus does not directly reveal the secrets searched for. However,
- the log file contains the name of the file including the secret string
- one can easily modify blueflower to include the secrets detected in the log file
Python modules:
blueflower is copyright (c) 2014 Jean-Philippe Aumasson, and under GPLv3.
The siphash.py
module is copyright (c)
2013 Philipp Jovanovic, and under MIT license.
The drawing in the image blueflower.jpg
is copyright
(c) 2014 Melina Aumasson, and under CC BY-NC-ND
3.0.