Screenshot Bot is a Node.js application designed to automate the process of capturing full-page screenshots of websites. It can handle multiple URLs concurrently, generating both mobile and desktop screenshots for each URL provided in a CSV file. The bot ensures efficient processing by limiting the number of concurrent page loads and provides real-time feedback on the progress via a console-based progress bar.
- Concurrent Processing: Handles multiple URLs simultaneously with configurable concurrency.
- Full-Page Screenshots: Captures the entire webpage for both mobile (500px wide) and desktop (1920px wide) views.
- CSV Input and Output: Ingests URLs from a CSV file and logs the result (success or error) to an output CSV.
- Automatic Directory Creation: Stores screenshots in a timestamped directory, ensuring organized output.
- Progress Bar: Displays real-time progress updates in the console, including status messages for each step.
- Node.js (v14 or higher)
- NPM (Node Package Manager)
Clone the repository:

    git clone https://github.com/shayclark/screenshotbot.git
    cd screenshotbot
Install dependencies:

    npm install
Prepare the input CSV file:
- Create a file named `urls.csv` in the root directory of the project.
- The file should contain one URL per line, like so:

      https://example.com
      https://another-example.com
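Reading a one-URL-per-line file like this could be sketched as follows. This is a minimal illustration rather than the bot's actual code: `parseUrlList` is a hypothetical helper, and skipping blank lines and trimming whitespace are assumptions about how the input should be cleaned.

```javascript
const fs = require("fs");

// Hypothetical helper: turn the raw file text into a clean list of URLs.
function parseUrlList(text) {
  return text
    .split(/\r?\n/) // accept both Unix and Windows line endings
    .map((line) => line.trim()) // drop stray whitespace around each URL
    .filter((line) => line.length > 0); // ignore blank lines
}

// Read urls.csv from the current directory if it is present.
const urls = fs.existsSync("urls.csv")
  ? parseUrlList(fs.readFileSync("urls.csv", "utf8"))
  : [];
console.log(`Loaded ${urls.length} URL(s)`);
```

Keeping the parsing in a pure function makes it easy to check against a sample string before pointing it at a real file.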
Run the bot:
To start the screenshot process, run the following command in your terminal:

    node index.js
Understanding the Output:
Screenshots:
- Screenshots are saved in a timestamped directory under `output/`.
- Each screenshot is saved with a filename that corresponds to the URL, followed by `-mobile` or `-desktop`, and saved as a `.png` file.
- Example directory structure:

      screenshotbot/
      ├── output/
      │   ├── 20240820123456/       // Timestamped folder
      │   │   ├── output.csv        // Log of the run
      │   │   └── screenshots/
      │   │       ├── example_com-mobile.png
      │   │       └── example_com-desktop.png
      ├── urls.csv
      ├── package.json
      └── index.js
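The naming scheme above can be sketched as follows. This is an illustration under stated assumptions, not the bot's actual implementation: `screenshotFileName` and `timestampFolderName` are hypothetical helpers, and both the sanitizing rule (runs of non-alphanumeric characters become underscores) and the use of local time are guesses inferred from the example names `example_com-mobile.png` and `20240820123456`.

```javascript
// Hypothetical helper: derive a file name such as "example_com-mobile.png".
// Assumption: host and path are joined and every run of non-alphanumeric
// characters is replaced by a single underscore.
function screenshotFileName(url, view) {
  const { hostname, pathname } = new URL(url);
  const base = (hostname + pathname)
    .replace(/[^a-zA-Z0-9]+/g, "_")
    .replace(/^_+|_+$/g, ""); // trim underscores left at either end
  return `${base}-${view}.png`;
}

// Hypothetical helper: build a folder name in YYYYMMDDHHMMSS form.
// Assumption: local time is used.
function timestampFolderName(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return (
    String(date.getFullYear()) +
    pad(date.getMonth() + 1) + // getMonth() is zero-based
    pad(date.getDate()) +
    pad(date.getHours()) +
    pad(date.getMinutes()) +
    pad(date.getSeconds())
  );
}

console.log(screenshotFileName("https://example.com", "mobile"));
console.log(timestampFolderName(new Date()));
```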
Output CSV:
- The bot generates an `output.csv` file inside the timestamped folder.
- The CSV file logs the status of each URL (e.g., `success`, `error - 404`, etc.).
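A row of such a log could be produced along these lines. The two-column layout (URL, then status) is an assumption, as the exact columns of `output.csv` are not specified here, and `csvField` and `toCsvLine` are hypothetical helpers. Quoting matters because URLs may contain commas.

```javascript
// Hypothetical helper: quote a field only when it contains a comma,
// a double quote, or a line break; embedded quotes are doubled.
function csvField(value) {
  const s = String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Hypothetical helper: one log line per URL, assuming "url,status" columns.
function toCsvLine(url, status) {
  return [url, status].map(csvField).join(",");
}

console.log(toCsvLine("https://example.com", "success"));
console.log(toCsvLine("https://another-example.com", "error - 404"));
```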
Customizing the Concurrency:
- The script processes URLs in batches to manage memory usage.
- You can adjust the concurrency limit by modifying the `MAX_CONCURRENT_PAGES` variable in `index.js`.
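Batch processing with a concurrency cap can be sketched as below. This is one simple way to do it, not necessarily how `index.js` is written: `processInBatches` is a hypothetical helper, the value `5` for `MAX_CONCURRENT_PAGES` is an assumed default, and the worker is a stand-in for the real page-loading step.

```javascript
const MAX_CONCURRENT_PAGES = 5; // assumed value; the real default may differ

// Hypothetical helper: run `worker` over `items`, at most `limit` at a time.
// Each batch must finish before the next one starts.
async function processInBatches(items, limit, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += limit) {
    const batch = items.slice(i, i + limit);
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults); // results keep the input order
  }
  return results;
}

// Example run with a stand-in worker; a real worker would load the page
// and capture its screenshots.
processInBatches(
  ["https://example.com", "https://another-example.com"],
  MAX_CONCURRENT_PAGES,
  async (url) => `done: ${url}`
).then((results) => console.log(results));
```

A lower limit means fewer pages are open at once, which trades speed for lower memory use.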
During execution, the bot displays a progress bar in the console. This bar shows:
- The current progress (number of URLs processed).
- The percentage completed.
- A status message indicating the current operation (e.g., "Loading URL", "Taking screenshot", etc.).
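The three pieces of information listed above could be combined into a single console line roughly as follows. This is only a sketch: `renderProgress` is a hypothetical function, the bar characters and width are arbitrary choices, and the bot itself may well rely on a progress-bar library instead.

```javascript
// Hypothetical helper: format count, percentage, and status as one line.
function renderProgress(done, total, status, width = 20) {
  const ratio = total > 0 ? done / total : 0; // avoid dividing by zero
  const filled = Math.round(ratio * width);
  const bar = "#".repeat(filled) + "-".repeat(width - filled);
  const percent = Math.round(ratio * 100);
  return `[${bar}] ${done}/${total} (${percent}%) ${status}`;
}

console.log(renderProgress(5, 20, "Loading URL"));
console.log(renderProgress(20, 20, "Taking screenshot"));
```

In a live terminal, writing the line with a leading `\r` via `process.stdout.write` would redraw it in place instead of printing a new line each time.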
If a URL returns an HTTP status other than 200, the bot logs the error in the output CSV and skips taking screenshots for that URL. The status code (for example, 301 or 404) is recorded in the CSV file next to the URL.
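The rule described here can be expressed as a small decision function. This is a sketch, with `resultForStatus` as a hypothetical name; the log strings follow the `success` and `error - 404` examples given for the output CSV.

```javascript
// Hypothetical helper: decide whether to capture screenshots and what to log.
// Only an exact 200 counts as success, per the rule described above.
function resultForStatus(statusCode) {
  const ok = statusCode === 200;
  return {
    capture: ok, // screenshots are skipped for anything other than 200
    logEntry: ok ? "success" : `error - ${statusCode}`,
  };
}

console.log(resultForStatus(200));
console.log(resultForStatus(404));
```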
Suppose you have a list of 50 URLs you want to monitor regularly. With this bot, you can automate the process of capturing screenshots for both mobile and desktop views, organize them by date, and have an error log if something goes wrong with specific URLs.
- Out of Memory: If your machine runs out of memory, try lowering the value of `MAX_CONCURRENT_PAGES` in `index.js`.
- No Output: Ensure that your `urls.csv` is correctly formatted with one URL per line.
This project is licensed under the GNU General Public License v3.0. You can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.