heaptruffle

Mine URLs from a browser's heap snapshot for fun and profit

Installation

Follow these steps to get heaptruffle up and running:

  1. Clone the Repository:

    git clone https://github.com/devanshbatham/heaptruffle
  2. Navigate to the Directory:

    cd heaptruffle
  3. Build the Docker Image:

    docker build -t heaptruffle .
  4. Make the script executable and move it to a directory in your PATH:

    sudo chmod +x heaptruffle
    sudo mv heaptruffle /usr/local/bin/heaptruffle

Once done, you can invoke heaptruffle from any location in your terminal.

Usage

Using Docker:

  • To run heaptruffle on a single URL:

    docker run -it --rm -v "$PWD":/app/data --name heaptruffle-container heaptruffle --url http://example.com
  • Or, to run it on a file containing URLs:

    docker run -it --rm -v "$PWD":/app/data --name heaptruffle-container heaptruffle --list urls.txt
  • Save the output to a file (output.txt):

    docker run -it --rm -v "$PWD":/app/data --name heaptruffle-container heaptruffle --url http://example.com --output /app/data/output.txt
  • Increase concurrency to fetch URLs faster:

    docker run -it --rm -v "$PWD":/app/data --name heaptruffle-container heaptruffle --list urls.txt --concurrency 10

Using the heaptruffle alias (after installation):

  • To run heaptruffle on a single URL:

    heaptruffle --url https://example.com
  • Or, to run it on a file containing URLs:

    heaptruffle --list urls.txt
  • Increase concurrency to fetch URLs faster:

    heaptruffle --list urls.txt --concurrency 10
  • Save the output to a file (output.txt):

    heaptruffle --url https://example.com --output output.txt
  • Use silent mode to suppress the ASCII banner:

    heaptruffle --url https://example.com --silent

Options

Option          Alias   Type      Description
--url           -u      string    URL address
--list          -l      string    File containing list of URLs
--concurrency   -c      number    Number of URLs to fetch concurrently (default: 5)
--silent        -s      boolean   Silent mode, does not display the ASCII banner (default: false)
--output        -o      string    File to save the output
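
To make the option types and defaults above concrete, here is a minimal sketch of how they could be parsed. It is not heaptruffle's actual parser, which this README does not describe. It uses Node's built-in util.parseArgs as a stand-in, and the function name parseCliOptions and its error messages are assumptions made for illustration.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical helper: turn an argv array into the options listed in the table.
function parseCliOptions(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      url: { type: 'string', short: 'u' },
      list: { type: 'string', short: 'l' },
      // parseArgs only knows 'string' and 'boolean', so the number is converted below.
      concurrency: { type: 'string', short: 'c' },
      silent: { type: 'boolean', short: 's' },
      output: { type: 'string', short: 'o' },
    },
  });

  // Default of 5 taken from the options table.
  const concurrency =
    values.concurrency === undefined ? 5 : Number(values.concurrency);
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new Error('--concurrency must be a positive integer');
  }
  // Assumption: at least one input source is required.
  if (!values.url && !values.list) {
    throw new Error('provide --url or --list');
  }

  return {
    url: values.url,
    list: values.list,
    concurrency,
    silent: values.silent ?? false,
    output: values.output,
  };
}
```

In a script this would typically be called as parseCliOptions(process.argv.slice(2)).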

How it Works

heaptruffle uses Puppeteer, a headless browser automation library, to load web pages and capture heap snapshots of their memory. It then parses these snapshots with the heapsnapshot-parser library and extracts URLs and endpoints from them.
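
As a rough sketch of this capture-and-extract step (not heaptruffle's actual code), the snippet below collects a heap snapshot over a Chrome DevTools Protocol session, such as the one Puppeteer returns from page.target().createCDPSession(), and then scans the snapshot's top-level strings array for URL-like values. Reading the raw snapshot JSON directly is a stand-in for heapsnapshot-parser, and the names collectHeapSnapshot, extractUrlLikeStrings and URL_LIKE are assumptions made for illustration.

```javascript
// Collect a heap snapshot from a Chrome DevTools Protocol session.
// `client` is any object with `on(event, handler)` and `send(method, params)`,
// for example a Puppeteer CDP session.
async function collectHeapSnapshot(client) {
  const chunks = [];
  // Chrome streams the snapshot as a series of string chunks.
  client.on('HeapProfiler.addHeapSnapshotChunk', ({ chunk }) => {
    chunks.push(chunk);
  });
  await client.send('HeapProfiler.enable');
  // In practice the chunk events arrive before this call resolves.
  await client.send('HeapProfiler.takeHeapSnapshot', { reportProgress: false });
  return JSON.parse(chunks.join(''));
}

// Hypothetical pattern: absolute http(s) URLs, or root-relative paths
// such as /api/v1/users, optionally followed by a query string.
const URL_LIKE =
  /^(?:https?:\/\/[^\s"'<>]+|\/[A-Za-z0-9_\-.~%]+(?:\/[A-Za-z0-9_\-.~%]*)*(?:\?[^\s"'<>]*)?)$/;

// Return the unique URL-like entries of the snapshot's `strings` array, sorted.
function extractUrlLikeStrings(snapshot) {
  const found = new Set();
  for (const value of snapshot.strings || []) {
    if (typeof value === 'string' && URL_LIKE.test(value)) {
      found.add(value);
    }
  }
  return [...found].sort();
}
```

With Puppeteer, the session would come from something like const client = await page.target().createCDPSession(); after the page has loaded. Matching whole strings keeps the sketch simple. A real tool would likely also search inside longer strings.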

The tool takes either a single URL or a file containing a list of URLs as input. To speed things up, it processes several URLs concurrently (5 by default, configurable with --concurrency). For each URL, heaptruffle loads the web page, captures a heap snapshot, and extracts the URLs and paths present in the page's memory at snapshot time. The results are written to the console or to the specified output file.
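
The concurrency limit can be pictured as a small pool of workers pulling from a shared queue. The sketch below is one possible way to do that and is not taken from heaptruffle's source. The name mapWithConcurrency and the { ok, value } / { ok, error } result shape are assumptions made for illustration.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// One failing item does not stop the others: each result records success or error.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  async function runLane() {
    while (next < items.length) {
      const index = next++; // claim an item (safe: JavaScript is single-threaded)
      try {
        results[index] = { ok: true, value: await worker(items[index]) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  // Start up to `limit` lanes, each of which processes items until none remain.
  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, runLane));
  return results;
}
```

For example, await mapWithConcurrency(urls, 5, scanUrl) would process five URLs at a time, where scanUrl is a hypothetical function that loads one page and returns the URLs found in its heap snapshot.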

Inspiration

This tool was inspired by the project extract-relative-url-heapsnapshot by smiegles. I improved on it in my own way and extended its functionality: concurrency, support for multiple URLs, pretty output, the ability to save the results to a file, Dockerization, error handling, an easy-to-use setup script, and so on.