Primary language: Python. License: MIT.

Password Guessing Agent using Reinforcement Learning

Introduction

This repository contains an implementation of a reinforcement learning agent trained to guess passwords. The agent is trained on a dataset of one million Russian passwords and subsequently tested against the extensive 'rockyou' password list. The agent uses the patterns it learns to discover frequently encountered passwords not present in the original dataset.

Features

  • ε-greedy policy: Helps the agent balance exploration and exploitation efficiently.
  • Q-learning algorithm: Enables the agent to learn optimal password-guessing strategies.
  • Backpropagation of rewards: Ensures the agent credits earlier successful actions appropriately.
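As a rough illustration of how a late reward can be credited to earlier actions, the sketch below computes discounted returns by walking an episode from its last step to its first. The function name `propagate_rewards` and the discount factor `gamma` are assumptions made for this example. The repository may propagate rewards differently.

```python
def propagate_rewards(rewards, gamma=0.9):
    """Return the discounted return for every step of one episode.

    rewards[i] is the immediate reward received after step i.
    gamma is an assumed discount factor, not a value taken from the repository.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backward so each step sees the (discounted) rewards that followed it.
    for i in range(len(rewards) - 1, -1, -1):
        running = rewards[i] + gamma * running
        returns[i] = running
    return returns


# A three-character episode where only the final guess is rewarded.
episode_returns = propagate_rewards([0.0, 0.0, 1.0], gamma=0.5)
```

With a discount below 1, steps closer to the reward receive more credit than steps far from it.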

Setup and Usage

  1. Clone the repository:

    git clone https://github.com/davidalami/pentest_rl.git
    cd pentest_rl
  2. Download the training and testing wordlists:

    wget https://github.com/sharsi1/russkiwlst/raw/master/stat_russkiwlst_top_1M.txt
    wget https://github.com/zacheller/rockyou/raw/master/rockyou.txt.tar.gz
  3. Run the password-guessing agent:

    python simple_q_learning.py

Implementation Details

The agent is based on a Q-learning reinforcement learning algorithm, which learns the best actions to take in different states (partially guessed passwords) to maximize the expected cumulative reward (correctly guessing the complete password).
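The standard tabular Q-learning update can be sketched as follows, with the state taken to be the prefix guessed so far and the action the next character. The names `Q`, `q_update`, `ALPHA`, and `GAMMA`, and their values, are assumptions for illustration and are not taken from `simple_q_learning.py`.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)

# Q[state][action]: state is the prefix guessed so far, action is the next character.
Q = defaultdict(lambda: defaultdict(float))


def q_update(state, action, reward, next_state, done):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # A terminal state, or a state with no recorded actions, contributes no future value.
    if done or not Q[next_state]:
        best_next = 0.0
    else:
        best_next = max(Q[next_state].values())
    target = reward + GAMMA * best_next
    Q[state][action] += ALPHA * (target - Q[state][action])
    return Q[state][action]
```

Each call moves `Q[state][action]` a fraction `ALPHA` of the way toward the observed reward plus the discounted value of the best known follow-up action.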

Environment

The environment consists of the list of passwords the agent is trained and tested on. Each password can be seen as a sequence of characters: the prefix guessed so far is the state, and the agent must guess each next character correctly.
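One way to model such an environment is to precompute every prefix of every password, so that checking a guessed character is a set lookup. The class `PasswordEnv` and its methods are hypothetical names for this sketch, and ending the episode at the first complete password is a simplification.

```python
class PasswordEnv:
    """Toy environment in which the state is the prefix typed so far."""

    def __init__(self, passwords):
        self.passwords = set(passwords)
        # Every non-empty prefix of every password, for constant-time checks.
        self.prefixes = {
            p[:i] for p in self.passwords for i in range(1, len(p) + 1)
        }
        self.state = ""

    def reset(self):
        """Start a new episode from the empty prefix."""
        self.state = ""
        return self.state

    def step(self, char):
        """Guess one character; return (state, valid, done)."""
        candidate = self.state + char
        valid = candidate in self.prefixes
        if valid:
            self.state = candidate
        # Simplification: stop on a wrong guess or on the first full password.
        done = (not valid) or candidate in self.passwords
        return self.state, valid, done
```

For a wordlist with a million entries, a prefix set of this kind can use a lot of memory. A trie would be a more compact alternative.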

Agent

The agent interacts with the environment, taking actions (guessing characters) based on its ε-greedy policy and updating its Q-values as it learns the consequences of its actions (rewards or penalties).
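A minimal sketch of ε-greedy action selection follows, assuming the Q-values for the current state are held in a dictionary keyed by character. The function name `epsilon_greedy` and its parameters are illustrative and may not match the repository's code.

```python
import random


def epsilon_greedy(q_values, actions, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-known one.

    q_values maps an action (a character) to its current Q-value.
    actions is the list of characters the agent may guess.
    """
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore
    # Exploit: actions never tried before are treated as having value 0.0.
    return max(actions, key=lambda a: q_values.get(a, 0.0))


# With epsilon set to 0 the choice is purely greedy.
best = epsilon_greedy({"a": 0.1, "b": 0.9}, ["a", "b", "c"], epsilon=0.0)
```

A common choice is to start with a large `epsilon` and reduce it as training progresses, so the agent explores early and exploits later.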

Reward

The reward is a numerical value that indicates the outcome of the agent's action. In this implementation, a positive reward is given for a correct guess, while a negative reward is given for an incorrect guess.
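The reward rule can be written as a small function. The magnitudes `+1.0` and `-1.0`, and the names below, are assumptions chosen for illustration. This section does not give the actual values used in the repository.

```python
CORRECT_REWARD = 1.0     # assumed magnitude for a correct guess
INCORRECT_REWARD = -1.0  # assumed magnitude for an incorrect guess


def reward_for_guess(prefix, char, valid_prefixes):
    """Return the reward for appending char to prefix.

    valid_prefixes is a set holding every prefix of every known password.
    """
    if prefix + char in valid_prefixes:
        return CORRECT_REWARD
    return INCORRECT_REWARD
```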

Future Directions

  • Improving Efficiency: Extending the reward system with partial rewards for geometric keyboard sequences, dictionary words, and words typed with the wrong keyboard layout active could help the agent learn more complex patterns faster.
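As one possible reading of a partial reward for keyboard sequences, the sketch below scores a guess by the fraction of consecutive characters that sit next to each other on the same QWERTY row. It is not part of the repository. The layout table, the function names, and the `weight` parameter are all assumptions, and vertical or diagonal key walks are ignored.

```python
QWERTY_ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]

# Map each key to its (row, column) position on the assumed layout.
KEY_POS = {
    ch: (r, c) for r, row in enumerate(QWERTY_ROWS) for c, ch in enumerate(row)
}


def keyboard_walk_score(guess):
    """Fraction of consecutive character pairs that are same-row neighbours."""
    text = guess.lower()
    pairs = list(zip(text, text[1:]))
    if not pairs:
        return 0.0
    adjacent = 0
    for a, b in pairs:
        if a in KEY_POS and b in KEY_POS:
            (row_a, col_a), (row_b, col_b) = KEY_POS[a], KEY_POS[b]
            if row_a == row_b and abs(col_a - col_b) == 1:
                adjacent += 1
    return adjacent / len(pairs)


def partial_reward(guess, weight=0.1):
    """Scale the walk score into a small bonus (weight is an assumed value)."""
    return weight * keyboard_walk_score(guess)
```

A bonus of this kind would be added to the ordinary reward, so a guess such as `qwerty` earns some credit even before it matches a known password.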