People think they are getting smarter by using passphrases. Let's prove them wrong!
This project includes a massive wordlist of phrases (17,737,982) and two hashcat rule files for GPU-based cracking.
The 'passphrases.txt' file is stored in Git Large File Storage (Git LFS), so download it via this link, or use git if you know what you're doing with Git LFS.
Use both rules for best results.
Here is an example for NTLMv2 hashes:
hashcat64.bin -a 0 -m 5600 hashes.txt passphrases.txt -r rule1.hashcat -r rule2.hashcat -O -w 2
So far, I've scraped the following:
- 15,000 Useful Phrases
- Urban Dictionary dataset pulled Dec 09 2017 using this great script.
- Song lyrics for Rolling Stone's "top 100" artists using my lyric scraping tool.
- Movie titles and lines from this Cornell project.
- "Titles" from the IMDB dataset on Kaggle.
- Global POI dataset using the 'allCountries' file.
- Quotables dataset on Kaggle.
- MemeTracker dataset from Kaggle.
- Wikipedia Article Titles dataset from Kaggle.
- 1,800 English Phrases
- 2016 US Presidential Debates dataset on Kaggle.
Check out the script cleanup.py to see how I've cleaned the raw sources. You can find the pre-cleaned data here.
Given the phrase 'take the red pill', the first hashcat rule will output the following:
take the red pill
take-the-red-pill
take.the.red.pill
take,the,red,pill
take_the_red_pill
taketheredpill
Take the red pill
TAKE THE RED PILL
tAKE THE RED PILL
Taketheredpill
tAKETHEREDPILL
TAKETHEREDPILL
Take The Red Pill
TakeTheRedPill
Take-The-Red-Pill
Take.The.Red.Pill
Take,The,Red,Pill
Take_The_Red_Pill
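To make the pattern in that list easier to see, here is a rough Python sketch that reproduces the same 18 variants. It is not the rule file itself and says nothing about how hashcat implements it. The function name `rule1_variants` is made up for illustration, and the sketch assumes the input phrase is lowercase and space-separated.

```python
def rule1_variants(phrase):
    """Approximate the candidates listed above for one lowercase phrase.

    Hypothetical helper for illustration only; the real work is done by
    the hashcat rule file, not by this function.
    """
    words = phrase.split()
    if not words:
        return []

    separators = [" ", "-", ".", ",", "_", ""]
    out = []

    # 1. All-lowercase words joined with each separator (6 variants).
    for sep in separators:
        out.append(sep.join(words))

    # 2. Whole-string case changes on the spaced and the joined form
    #    (3 variants each: capitalized, upper, inverted-capitalized).
    for base in (" ".join(words), "".join(words)):
        out.append(base.capitalize())
        out.append(base.upper())
        out.append(base[0].lower() + base[1:].upper())

    # 3. Every word capitalized, joined with each separator (6 variants).
    titled = [w.capitalize() for w in words]
    for sep in separators:
        out.append(sep.join(titled))

    # Drop duplicates (e.g. for single-word input) but keep first-seen order.
    return list(dict.fromkeys(out))


if __name__ == "__main__":
    for candidate in rule1_variants("take the red pill"):
        print(candidate)
```

The order of the printed lines differs slightly from the list above; only the set of candidates is meant to match.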
Adding the second hashcat rule makes things a bit more interesting: it returns a huge list of candidates per phrase. Here are a couple of examples:
T@k3Th3R3dPill!
T@ke-The-Red-Pill
taketheredpill2020!
T0KE THE RED PILL (unintentional humor)
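The examples above suggest the second rule layers character substitutions and suffixes on top of the first rule's output. A minimal Python sketch of that idea follows. The substitution tables and suffixes here are assumptions picked only to match three of the examples; the actual contents of the second rule file are not shown in this README and are certainly much larger. The name `rule2_variants` is hypothetical.

```python
from itertools import product

# Assumed for illustration only; not taken from the real rule file.
SUBSTITUTIONS = [{}, {"a": "@"}, {"a": "@", "e": "3"}]
SUFFIXES = ["", "!", "2020", "2020!"]


def rule2_variants(candidate):
    """Combine every assumed substitution table with every assumed suffix."""
    out = []
    for subs, suffix in product(SUBSTITUTIONS, SUFFIXES):
        # str.translate swaps single characters according to the table.
        mutated = candidate.translate(str.maketrans(subs))
        out.append(mutated + suffix)
    # Drop duplicates (candidates without 'a' or 'e') but keep order.
    return list(dict.fromkeys(out))


if __name__ == "__main__":
    for first_rule_output in ("TakeTheRedPill", "Take-The-Red-Pill", "taketheredpill"):
        for candidate in rule2_variants(first_rule_output):
            print(candidate)
```

Because each input is one of the first rule's outputs, the candidate count multiplies: every phrase variant is combined with every substitution and suffix, which is why the list per phrase gets so large.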
Enjoy!