/Cyber_Crime_Science_Project

Python scripts used to study public password leaks for a project at TU Delft

Cyber Crime Science Project

Contains the collection of files used to analyze data for the Cyber Crime Science project at TU Delft.

These scripts were used to analyze the file fortinet-2021.txt. The file can be found in the SecLists GitHub repository.

Each script in this repository includes a brief description of its purpose. The main pipeline used to analyze the dataset consists of:

  • psw_extrac.py: Extracts passwords from the fortinet-2021.txt file into an output file
  • scan.py: Creates a folder in which all passwords from an input file are split into multiple files according to the sequences recognized by zxcvbn
  • addToDict_txt.py: Takes two folders as input, one containing well-known dictionaries (e.g. RockYou.txt, ...) and the other containing the per-sequence password files produced by scan.py, and replaces passwords in "dict.txt" if they are found in the known dictionaries
  • rate_psw.py: Counts the lines in each password file, assigns the score given by zxcvbn, and writes the results to a file
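
The grouping step performed by scan.py can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the names sequence_key and group_by_sequence are hypothetical, and joining the pattern names with "_" is an assumed way to label each group. The analyzer is passed in as a parameter, so the sketch only assumes that it returns a dict with a "sequence" list whose entries carry a "pattern" key, which is the shape returned by the zxcvbn Python package.

```python
from collections import defaultdict


def sequence_key(analysis):
    """Build a label from the pattern names in a zxcvbn-style result.

    Assumption: groups are labelled by joining pattern names with "_",
    for example "dictionary_date". An empty sequence maps to "empty".
    """
    return "_".join(match["pattern"] for match in analysis["sequence"]) or "empty"


def group_by_sequence(passwords, analyze):
    """Group passwords by the pattern sequence reported by `analyze`.

    `analyze` is any callable that takes a password and returns a dict
    with a "sequence" list of matches, each holding a "pattern" key.
    Blank entries are skipped rather than analyzed.
    """
    groups = defaultdict(list)
    for password in passwords:
        if not password:
            continue
        groups[sequence_key(analyze(password))].append(password)
    return dict(groups)
```

With the zxcvbn package installed, the function zxcvbn.zxcvbn could be passed as analyze. The length of each resulting list then gives a per-group count similar to the one rate_psw.py reports.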

Use cleanFiles.py if you want to limit your research to files with unique passwords.
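
As a rough idea of what restricting a file to unique passwords involves, the sketch below removes duplicate lines while keeping the first occurrence of each password in its original order. It is an assumption about the general technique, not a copy of cleanFiles.py, and the function name unique_lines is hypothetical.

```python
def unique_lines(lines):
    """Return the passwords in `lines` without duplicates, in first-seen order.

    Trailing newlines are stripped and blank lines are dropped.
    """
    seen = set()
    result = []
    for line in lines:
        password = line.rstrip("\n")
        if password and password not in seen:
            seen.add(password)
            result.append(password)
    return result
```

Because an open text file yields its lines when iterated, the file object can be passed directly as lines.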