Overview

People think they are getting smarter by using passphrases. Let's prove them wrong!

This project includes a massive wordlist of phrases (17,737,982) and two hashcat rule files for GPU-based cracking.

The 'passphrases.txt' file is stored in Git Large File Storage (Git LFS), so download it via this link, or use git if you know what you're doing with Git LFS.

Use both rules for best results.

Here is an example for NTLMv2 hashes:

hashcat64.bin -a 0 -m 5600 hashes.txt passphrases.txt -r rule1.hashcat -r rule2.hashcat -O -w 2
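When two rule files are passed with -r, hashcat combines every rule in the first file with every rule in the second, so the number of candidates grows multiplicatively. Here is a rough sketch for estimating the size of a run before starting it. The helper names are made up for illustration, and the rule count of 18 is only an assumption taken from the 18 example variants listed under Hashcat Rules, not a count of the actual rule file.

```python
def count_rules(lines):
    """Count rule lines, skipping blank lines and '#' comment lines."""
    return sum(
        1
        for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    )


def candidate_count(words, *rule_counts):
    """Words multiplied by the number of rules in each stacked rule file."""
    total = words
    for count in rule_counts:
        total *= count
    return total


if __name__ == "__main__":
    phrases = 17_737_982   # wordlist size stated in the overview
    assumed_rule1 = 18     # assumption: one rule per example variant
    print(candidate_count(phrases, assumed_rule1))
```

To get real numbers, pass the lines of each rule file to count_rules and feed both counts to candidate_count.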

Sources Used

So far, I've scraped the following:

15,000 Useful Phrases
Urban Dictionary dataset pulled Dec 09 2017 using this great script.
Song lyrics for Rolling Stone's "top 100" artists using my lyric scraping tool.
Movie titles and lines from this Cornell project.
"Titles" from the IMDB dataset on Kaggle.
Global POI dataset using the 'allCountries' file.
Quotables dataset on Kaggle.
MemeTracker dataset from Kaggle.
Wikipedia Article Titles dataset from Kaggle.
1,800 English Phrases
2016 US Presidential Debates dataset on Kaggle.

Cleaning sources

Check out the script cleanup.py to see how I've cleaned the raw sources. You can find the pre-cleaned data here.

Hashcat Rules

Given the phrase 'take the red pill', the first hashcat rule will output the following:

take the red pill
take-the-red-pill
take.the.red.pill
take,the,red,pill
take_the_red_pill
taketheredpill
Take the red pill
TAKE THE RED PILL
tAKE THE RED PILL
Taketheredpill
tAKETHEREDPILL
TAKETHEREDPILL
Take The Red Pill
TakeTheRedPill
Take-The-Red-Pill
Take.The.Red.Pill
Take,The,Red,Pill
Take_The_Red_Pill
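If you want to preview these variants without running hashcat, the list above can be roughly reproduced in Python. This is only a sketch that mimics the example output. It is not a translation of the rule file, and the function names and separator list are my own assumptions.

```python
SEPARATORS = ["-", ".", ",", "_"]  # assumed from the example output above


def capitalize_first(text):
    """Upper-case the first character, lower-case the rest."""
    return text[:1].upper() + text[1:].lower()


def invert_capitalize(text):
    """Lower-case the first character, upper-case the rest."""
    return text[:1].lower() + text[1:].upper()


def phrase_variants(phrase):
    """Return separator and case variants of a space-delimited phrase."""
    words = phrase.lower().split()
    titled = [capitalize_first(word) for word in words]
    spaced = " ".join(words)
    joined = "".join(words)

    variants = [spaced]
    variants += [sep.join(words) for sep in SEPARATORS]
    variants.append(joined)
    for base in (spaced, joined):
        variants += [
            capitalize_first(base),
            base.upper(),
            invert_capitalize(base),
        ]
    variants.append(" ".join(titled))
    variants.append("".join(titled))
    variants += [sep.join(titled) for sep in SEPARATORS]

    # Drop duplicates (e.g. for one-word phrases) while keeping order.
    return list(dict.fromkeys(variants))


if __name__ == "__main__":
    for candidate in phrase_variants("take the red pill"):
        print(candidate)
```

The order differs slightly from hashcat's output, but the set of candidates for this phrase is the same as the list above.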

Adding the second hashcat rule makes things a bit more interesting: it returns a huge list per candidate. Here are a couple of examples:

T@k3Th3R3dPill!
T@ke-The-Red-Pill
taketheredpill2020!
T0KE THE RED PILL (unintentional humor)
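To get a feel for why the list grows so fast, here is a small Python sketch of the same idea: character substitutions combined with appended suffixes. The substitution table and suffix list below are assumptions picked to match a few of the examples above. The real rule file covers far more, and this sketch does not try to reproduce the 'T0KE' case.

```python
from itertools import product

LEET = {"a": "@", "e": "3"}             # assumed substitutions
SUFFIXES = ["", "!", "2020", "2020!"]   # assumed suffixes


def substitute(candidate, mapping):
    """Replace every occurrence of each mapped character."""
    return candidate.translate(str.maketrans(mapping))


def mangle(candidate):
    """Apply every subset of substitutions, then every suffix."""
    keys = list(LEET)
    results = []
    for mask in product([False, True], repeat=len(keys)):
        mapping = {key: LEET[key] for key, on in zip(keys, mask) if on}
        swapped = substitute(candidate, mapping)
        for suffix in SUFFIXES:
            results.append(swapped + suffix)
    # Drop duplicates while keeping order.
    return list(dict.fromkeys(results))


if __name__ == "__main__":
    for candidate in mangle("TakeTheRedPill"):
        print(candidate)
```

Even with two substitutions and four suffixes, one input turns into up to 16 candidates, and that is on top of the variants from the first rule.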

Enjoy!

W9HAX/passphrase-wordlist
