Measuring the Accuracy of Password Strength Meters

Password strength meters are an important tool to help users choose more secure passwords. However, a strength meter can only provide reasonable guidance if it is accurate, i.e., if its score correctly reflects password strength. A strength meter with low accuracy may do more harm than good by guiding users toward passwords that receive a high score but offer little actual security.

The preferred method to measure the accuracy of a strength meter is to compare it to an ideal reference, measuring the similarity between the reference and the meter output. In our work, On the Accuracy of Password Strength Meters, we found the weighted Spearman's rank correlation coefficient to be a useful candidate to measure the accuracy of a strength meter compared to the ideal reference.
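To make the idea concrete, the following is a minimal sketch of a weighted rank correlation in plain Python. It is a simplification under stated assumptions: values are ranked without weights (ties share an average rank) and a weighted Pearson correlation is then computed on those ranks. The function names are hypothetical, and the wCorr package used later in this guide may compute ranks differently, so the numbers need not match its output.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def weighted_pearson(x, y, w):
    """Pearson correlation in which observation i counts with weight w[i]."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


def weighted_spearman(reference, meter, weights):
    """Simplified weighted Spearman: weighted Pearson on unweighted ranks."""
    return weighted_pearson(average_ranks(reference), average_ranks(meter), weights)
```

Calling `weighted_spearman(reference, meter, weights)` with a meter that orders all passwords exactly like the reference yields a value of approximately 1.0, and a fully reversed ordering yields approximately -1.0, regardless of the weights.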

This repository contains the code needed to crawl a password strength meter and evaluate its accuracy. We hope that this code helps meter developers improve their implementations.

Related Work

Project Website

The original paper, a recording of the talk, the slides, and more information can be found at:

User Guide

The code consists of three parts: crawler, post-processing, and evaluation. The crawler uses the Selenium framework to automate a headless Google Chrome or Mozilla Firefox browser and queries the meter with a predefined list of passwords. In the post-processing step, a small Python script prepares the crawled data for the evaluation step in R. In the final step, we calculate the weighted Spearman correlation to estimate the accuracy of the crawled meter.

Obtaining an Ideal Reference

While this repository contains some example passwords from the RockYou, LinkedIn, and 000Webhost leaks, a more thorough evaluation requires more than one ground truth. The easiest way to obtain such references is to sample passwords from password leaks and evaluate them using CMU's Password Guessability Service (PGS). For more details, please refer to the paper.

Installation

To keep the guide short, we assume the use of Ubuntu 22.04 LTS. All Python code snippets were tested using Python 3.10, and all R scripts assume R version 4.3 or later.

Check out the source code via:

$ git clone https://github.com/RUB-SysSec/Password-Strength-Meter-Accuracy.git PSMA

├── README.md
└── src
    ├── analyze
    │   ├── 01_build_r_file.py
    │   ├── 02_corr-comp.r
    │   ├── result_000webhost_offline.csv
    │   ├── result_000webhost_online.csv
    │   ├── result_linkedin_offline.csv
    │   ├── result_linkedin_online.csv
    │   ├── result_rockyou_offline.csv
    │   └── result_rockyou_online.csv
    ├── crawl
    │   └── 01_zxcvbn
    │       ├── 0_000webhost.offline.pw_guess_number_result.txt
    │       ├── 0_000webhost.offline.pw_score_result.txt
    │       ├── 0_000webhost.online.pw_guess_number_result.txt
    │       ├── 0_000webhost.online.pw_score_result.txt
    │       ├── 0_linkedin.offline.pw_guess_number_result.txt
    │       ├── 0_linkedin.offline.pw_score_result.txt
    │       ├── 0_linkedin.online.pw_guess_number_result.txt
    │       ├── 0_linkedin.online.pw_score_result.txt
    │       ├── 0_rockyou.offline.pw_guess_number_result.txt
    │       ├── 0_rockyou.offline.pw_score_result.txt
    │       ├── 0_rockyou.online.pw_guess_number_result.txt
    │       ├── 0_rockyou.online.pw_score_result.txt
    │       ├── zxcvbn_chrome.py
    │       └── zxcvbn_firefox.py
    ├── datasets
    │   ├── offline
    │   │   ├── 000webhost
    │   │   │   ├── 0_000webhost.offline.pw
    │   │   │   ├── 1_000webhost.offline.strength
    │   │   │   ├── 2_000webhost.offline.weight
    │   │   │   └── 3_000webhost.offline.withcount
    │   │   ├── linkedin
    │   │   │   ├── 0_linkedin.offline.pw
    │   │   │   ├── 1_linkedin.offline.strength
    │   │   │   ├── 2_linkedin.offline.weight
    │   │   │   └── 3_linkedin.offline.withcount
    │   │   └── rockyou
    │   │       ├── 0_rockyou.offline.pw
    │   │       ├── 1_rockyou.offline.strength
    │   │       ├── 2_rockyou.offline.weight
    │   │       └── 3_rockyou.offline.withcount
    │   └── online
    │       ├── 000webhost
    │       │   ├── 0_000webhost.online.pw
    │       │   ├── 1_000webhost.online.strength
    │       │   ├── 2_000webhost.online.weight
    │       │   └── 3_000webhost.online.withcount
    │       ├── linkedin
    │       │   ├── 0_linkedin.online.pw
    │       │   ├── 1_linkedin.online.strength
    │       │   ├── 2_linkedin.online.weight
    │       │   └── 3_linkedin.online.withcount
    │       └── rockyou
    │           ├── 0_rockyou.online.pw
    │           ├── 1_rockyou.online.strength
    │           ├── 2_rockyou.online.weight
    │           └── 3_rockyou.online.withcount
    └── meter
        └── 01_zxcvbn
            ├── eval.js
            ├── index.html
            ├── jquery-3.7.1.min.js
            └── zxcvbn_v4.4.2.js

Step 0: Preparation

Before we start, we need to install some dependencies, like Python PIP, Selenium, and a WebDriver for your browser.

Configuring Python and Installing Selenium

First, we install the Python package installer (pip) and the Python virtual environment tool (virtualenv).

$ sudo apt-get install python3-pip python3-virtualenv

We start by creating a new Python virtual environment that we use only for this project.

$ virtualenv -p /usr/bin/python3 venv

Next, we activate the Python virtual environment.

$ source venv/bin/activate

Now, we install selenium.

(venv) $ pip install selenium

Installing the WebDriver

By default, Ubuntu 22.04 LTS ships Firefox as a Snap package, which causes many issues with geckodriver.

We therefore need to replace the Firefox Snap with the Debian (deb) package. The detailed tutorial from OMG! Ubuntu! describes how to do this.

To allow Selenium to communicate and automate your browser, we also need to install your web browser's driver.

(venv) $ pip install webdriver-manager

The examples in this repo are for Google Chrome and Mozilla Firefox. If you use another browser (Brave, Chromium, Edge, Opera), have a look here.

Installing R

Install the R open-source programming language on your Ubuntu machine.

Optional: Install RStudio Desktop for more convenience.

Step 1: Crawling the Meter

In the first step, we need to get the estimates from the strength meter for a given set of passwords. For most web-based password strength meters, we need the Selenium framework and a web browser (and its WebDriver!) to obtain such estimates. Our tutorial includes an example based on the zxcvbn strength meter.

While we try to explain everything in detail, we skip the part on how to install a web browser on your system. Just use Google Chrome or Mozilla Firefox.

Note: If your meter is not a web-based strength meter, you need to write a custom function that outputs strength estimates for the lists of passwords found in the datasets folder.
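For a meter that is not web-based, such a custom function could look roughly like the sketch below. Everything in it is an assumption for illustration: `toy_meter` is a hypothetical stand-in (a naive brute-force estimate, not a real meter), the password files are assumed to contain one password per line, and the tab-separated output is not necessarily the format that the post-processing script expects.

```python
import math
import sys


def toy_meter(password):
    """Hypothetical meter: log10 of the brute-force search space for the password."""
    alphabet = 0
    if any(c.islower() for c in password):
        alphabet += 26
    if any(c.isupper() for c in password):
        alphabet += 26
    if any(c.isdigit() for c in password):
        alphabet += 10
    if any(not c.isalnum() for c in password):
        alphabet += 33  # rough count of printable ASCII symbols
    return len(password) * math.log10(alphabet) if alphabet else 0.0


def read_passwords(path):
    """Read one password per line (assumed file layout), skipping empty lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        stripped = (line.rstrip("\r\n") for line in handle)
        return [password for password in stripped if password]


def score_passwords(passwords, meter):
    """Pair every password with the estimate returned by the given meter."""
    return [(password, meter(password)) for password in passwords]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print "password<TAB>estimate" lines for the file given on the command line.
    for password, estimate in score_passwords(read_passwords(sys.argv[1]), toy_meter):
        print(f"{password}\t{estimate:.3f}")
```

To evaluate your own meter, you would replace `toy_meter` with a call into it and adapt the output to whatever your post-processing step reads.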

Navigate to src/meter/01_zxcvbn/ and open index.html with your browser. Copy the path that is displayed in the URL bar, likely similar to:

file:///home/<username>/PSMA/src/meter/01_zxcvbn/index.html

Now edit src/crawl/01_zxcvbn/zxcvbn_chrome.py or src/crawl/01_zxcvbn/zxcvbn_firefox.py and change the path according to your environment.

...
    driver.get('file:///home/<username>/PSMA/src/meter/01_zxcvbn/index.html')
...

Save your edits!
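If you prefer not to hard-code the path, the URL can also be derived from the location of your checkout. The following is a small sketch using Python's pathlib. The helper name `meter_url` is hypothetical, and the sketch assumes a POSIX system such as the Ubuntu setup used in this guide.

```python
from pathlib import Path


def meter_url(repo_root):
    """Return the file:// URL of the bundled zxcvbn page below repo_root.

    repo_root must be an absolute path, otherwise as_uri() raises ValueError.
    """
    page = Path(repo_root) / "src" / "meter" / "01_zxcvbn" / "index.html"
    return page.as_uri()


# Example, assuming the repository was cloned to ~/PSMA:
# driver.get(meter_url(Path.home() / "PSMA"))
```

Note that `as_uri()` also percent-encodes characters such as spaces, which is easy to get wrong when the URL is typed by hand.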

Next, we make sure that the Python virtual environment is activated, and we change the current directory to src/crawl/01_zxcvbn/.

$ cd src/crawl/01_zxcvbn/
$ source ~/venv/bin/activate
(venv) $ python zxcvbn_chrome.py ../../datasets/online/linkedin/0_linkedin.online.pw

Next, repeat the crawling for the different datasets (linkedin, 000webhost, rockyou) and the two scenarios (online and offline).
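Instead of starting six runs by hand, the repetition can be scripted. The sketch below builds the six password file paths from the naming scheme shown in the directory listing and calls the crawler once per file. The helper names are hypothetical, and the sketch assumes it is started from src/crawl/01_zxcvbn/ with the virtual environment activated.

```python
import subprocess
import sys
from pathlib import Path

DATASETS = ("linkedin", "000webhost", "rockyou")
SCENARIOS = ("online", "offline")


def password_files(datasets_dir):
    """List the six password files following the repository's naming scheme."""
    root = Path(datasets_dir)
    return [
        root / scenario / dataset / f"0_{dataset}.{scenario}.pw"
        for scenario in SCENARIOS
        for dataset in DATASETS
    ]


def crawl_all(crawler="zxcvbn_chrome.py", datasets_dir="../../datasets"):
    """Run the crawler script once per password file and stop on the first failure."""
    for pw_file in password_files(datasets_dir):
        # sys.executable is the interpreter of the currently active environment.
        subprocess.run([sys.executable, crawler, str(pw_file)], check=True)
```

Calling `crawl_all()` uses the Chrome crawler, and `crawl_all("zxcvbn_firefox.py")` uses the Firefox one.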

Step 2: Post-Processing

First, navigate to src/analyze/ and edit 01_build_r_file.py to your needs.

On the Terminal run:

$ cd src/analyze/
$ python 01_build_r_file.py

It will produce 6 files:

  • result_rockyou_online.csv
  • result_rockyou_offline.csv
  • result_linkedin_online.csv
  • result_linkedin_offline.csv
  • result_000webhost_online.csv
  • result_000webhost_offline.csv

Example: Contents of result_linkedin_online.csv:

password      strength       weight    count    zxcvbn_guess_number    zxcvbn_score
123456      -1044164.0    1044164.0     30.0                    2.0             0.0
linkedin     -193001.0     193001.0      6.0                22802.0             1.0
password     -176120.0     176120.0      4.0                    3.0             0.0
111111        -78720.0      78720.0      4.0                    9.0             0.0
...

Step 3: Evaluation using R

On Ubuntu 22.04 we need to install GNU Fortran

$ sudo apt install gfortran

and the Linear Algebra PACKage (LAPACK) and the Basic Linear Algebra Subprograms (BLAS):

$ sudo apt-get install libblas-dev liblapack-dev

Next, start R, e.g., by running RStudio or R, and install the wCorr package:

> install.packages("wCorr")

This adds support for weighted correlations to R.

Next, replace <username> in the R script 02_corr-comp.r so that the working directory matches your system:

setwd('/home/<username>/PSMA/src/analyze/')

Finally, run 02_corr-comp.r, e.g., by executing the following on the terminal:

$ Rscript 02_corr-comp.r

The output should look like this:

zxcvbn_guess_number     rockyou      online      0.812
zxcvbn_guess_number     linkedin     online      0.802
zxcvbn_guess_number     000webhost   online      0.435
zxcvbn_guess_number     rockyou      offline     0.772
zxcvbn_guess_number     linkedin     offline     0.908
zxcvbn_guess_number     000webhost   offline     0.904
zxcvbn_score    rockyou      online      0.455
zxcvbn_score    linkedin     online      0.376
zxcvbn_score    000webhost   online      0.355
zxcvbn_score    rockyou      offline     0.545
zxcvbn_score    linkedin     offline     0.670
zxcvbn_score    000webhost   offline     0.880
> # -1.0 means strong negative correlation; meter works but 'strong' passwords are in fact 'weak' and the other way round
> #  0.0 means no correlation; meter is randomly guessing, and not estimating password strength
> #  1.0 means strong positive correlation; meter works perfectly

FAQ

  • The crawling does not work! Make sure you have activated the virtual environment. Check for the presence of the (venv) in front of your prompt. Make sure you have the latest WebDriver installed on your system.
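The first two checks from this FAQ entry can be automated with a few lines of Python. This is a rough diagnostic sketch with hypothetical helper names. It only tests whether the interpreter runs inside a virtual environment and whether the two packages installed in Step 0 are importable, and it cannot tell whether the WebDriver itself is up to date.

```python
import importlib.util
import sys


def in_virtualenv():
    """True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones change sys.prefix.
    return hasattr(sys, "real_prefix") or sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def has_module(name):
    """True if a top-level module with this name can be found for import."""
    return importlib.util.find_spec(name) is not None


def environment_report():
    """Collect the checks into a dictionary that is easy to print."""
    return {
        "virtualenv active": in_virtualenv(),
        "selenium installed": has_module("selenium"),
        "webdriver_manager installed": has_module("webdriver_manager"),
    }


if __name__ == "__main__":
    for check, passed in environment_report().items():
        print(f"{check}: {'yes' if passed else 'NO'}")
```

If any line prints NO, repeat the corresponding part of Step 0 before crawling again.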

License

Our code in the Password-Strength-Meter-Accuracy repository is licensed under the MIT license. Refer to docs/LICENSE for more information.

Third-Party Libraries

  • zxcvbn is a password strength meter developed by Daniel Wheeler and Dropbox, Inc. and is released under the MIT license. The license can be found here.
  • jQuery is a JavaScript library developed by the JS Foundation and is released under the MIT license. The license can be found here.
  • wCorr is an R package developed by Ahmad Emad and Paul Bailey and is released under the GPL-2 license. The license can be found here.
  • Selenium is a browser automation framework developed by ThoughtWorks and is released under the Apache 2.0 license. The license can be found here.

Contact

Visit our website and follow us on Twitter. If you are interested in passwords, consider contributing to and attending the International Conference on Passwords (PASSWORDS).