N8n Ultimate Scraper is an open-source framework for n8n builders who need an efficient, scalable scraper for their projects.
The scraper can extract data displayed on a web page and can use cookies to log in to the target site.
The scraper takes the following input parameters:
- Subject (e.g., Hugging Face)
- Domain name/website (e.g., github.com)
- Targeted data (e.g., number of followers)
- Website cookies (optional, used for authentication)
With the example values above, the request returns structured JSON containing the number of followers on the Hugging Face GitHub page.
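As a rough illustration, the four inputs map onto a JSON request body like the one built below. The field names are taken from the curl examples later in this README; build_payload is a hypothetical helper, not part of the repository, and it does no JSON escaping, so it only suits simple values without double quotes or backslashes.

```shell
# Hypothetical helper (not part of the repository): assemble a request body
# from subject, domain, target data name, and target data description.
# Field names mirror the curl examples in this README. Values are inserted
# verbatim (no JSON escaping), and the cookies list is left empty.
build_payload() {
  printf '{"subject": "%s", "Url": "%s", "Target data": [{"DataName": "%s", "description": "%s"}], "cookies": []}\n' \
    "$1" "$2" "$3" "$4"
}

build_payload "Hugging Face" "github.com" "Followers" \
  "The number of followers of the GitHub page"
```

The printed line can be passed to curl with the -d option, in place of the inline JSON used in the examples.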
Docker Compose Setup for n8n and Selenium
This repository contains a Docker Compose configuration for deploying n8n and a Selenium standalone Chrome service.
Follow these steps to deploy the services using Docker Compose. Before you start, check the prerequisites:
- Docker: Ensure Docker is installed on your system. You can download it from the official Docker website.
- Docker Compose: Ensure Docker Compose is installed. You can check by running:
docker-compose --version
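A quick way to check both prerequisites at once is sketched below. have_cmd is a hypothetical helper; the script only reports what it finds and changes nothing. Recent Docker releases ship Compose as a plugin invoked as docker compose (with a space), so the sketch reports on that form as well.

```shell
# Hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd docker; then
  echo "docker: found"
else
  echo "docker: missing"
fi

# Compose may be the standalone docker-compose binary or the "docker compose" plugin.
if have_cmd docker-compose; then
  echo "docker-compose: found"
elif have_cmd docker && docker compose version >/dev/null 2>&1; then
  echo "docker compose plugin: found"
else
  echo "Docker Compose: missing"
fi
```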
Start by cloning the repository, which contains the docker-compose.yml file and the .env file:
git clone https://github.com/Touxan/n8n-ultimate-scraper.git
cd n8n-ultimate-scraper
In the root directory, edit the .env file and set the necessary environment variables for n8n and Selenium:
# .env
# N8n Variables
N8N_BASIC_AUTH_ACTIVE=true
N8N_BASIC_AUTH_USER=n8nuser
N8N_BASIC_AUTH_PASSWORD=n8npassword
# Selenium Variables
HTTP_PROXY=http://proxyAddress:port
HTTPS_PROXY=http://proxyAddress:port
Make sure to replace n8nuser and n8npassword with safer values.
Also replace proxyAddress and port with the actual values for your proxy.
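One way to produce a stronger value for N8N_BASIC_AUTH_PASSWORD is to read random bytes from /dev/urandom and print them as hexadecimal, as sketched below. gen_password is a hypothetical helper, and the 16-byte default (32 hex characters) is an arbitrary choice; paste the printed value into .env by hand.

```shell
# Hypothetical helper: print a random password as hexadecimal text.
# Usage: gen_password [BYTES]   (defaults to 16 bytes, i.e. 32 hex characters)
gen_password() {
  # od prints the bytes as space-separated hex pairs; tr strips the whitespace.
  head -c "${1:-16}" /dev/urandom | od -An -tx1 | tr -d ' \t\n'
  echo
}

gen_password
```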
To deploy the n8n and Selenium services, run the following command:
docker-compose up -d
This will:
- Spin up the n8n service and expose it on port 5678.
- Spin up the Selenium standalone Chrome service and expose it on port 4444.
- Connect both services on a custom Docker network.
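After docker-compose up -d returns, the containers may still need a few seconds before they answer requests. The sketch below retries a command until it succeeds; wait_for is a hypothetical helper, and the demonstration uses a trivial command so the block runs anywhere.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# The command always runs at least once; its output is discarded.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Give up once the last allowed attempt has failed.
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Self-contained demonstration: "true" succeeds on the first attempt.
wait_for 3 0 true && echo "ready"
```

To check the two services, you could run wait_for 30 2 curl -fsS http://localhost:5678 for n8n and wait_for 30 2 curl -fsS http://localhost:4444/status for Selenium. The /status path is an assumption based on Selenium Grid 4; older images may expose a different path.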
- n8n: You can access n8n by navigating to http://localhost:5678 in your browser.
- Workflow: Import Selenium_Ultimate_Scraper_Workflow.json into your n8n instance and activate it.
To call the workflow, you can start with this curl command (replace yourwebhookid with your own webhook path):
curl -X POST http://localhost:5678/yourwebhookid \
-H "Content-Type: application/json" \
-d '{
"subject": "Hugging Face",
"Url": "github.com",
"Target data": [
{
"DataName": "Followers",
"description": "The number of followers of the GitHub page"
},
{
"DataName": "Total Stars",
"description": "The total numbers of stars on the different repos"
}
],
"cookies": []
}'
Or, to scrape a single URL directly:
curl -X POST http://localhost:5678/webhook-test/67d77918-2d5b-48c1-ae73-2004b32125f0 \
-H "Content-Type: application/json" \
-d '{
"Target Url": "https://github.com",
"Target data": [
{
"DataName": "Followers",
"description": "The number of followers of the GitHub page"
},
{
"DataName": "Total Stars",
"description": "The total numbers of stars on the different repo"
}
],
"cookies": []
}'
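The second example above targets a test URL. In n8n, a webhook's test URL (under /webhook-test/) only responds while the workflow is listening for a test event in the editor, whereas the production URL (under /webhook/) responds once the workflow is activated. The sketch below builds either form; webhook_url is a hypothetical helper, and it assumes n8n's default path prefixes, which can be changed in n8n's configuration.

```shell
# Hypothetical helper: build an n8n webhook URL.
# Usage: webhook_url BASE_URL MODE WEBHOOK_PATH   (MODE is "test" or "prod")
# Assumes n8n's default prefixes: /webhook-test/ (test) and /webhook/ (production).
webhook_url() {
  case $2 in
    test) printf '%s/webhook-test/%s\n' "$1" "$3" ;;
    prod) printf '%s/webhook/%s\n' "$1" "$3" ;;
    *) echo "unknown mode: $2 (expected test or prod)" >&2; return 1 ;;
  esac
}

webhook_url http://localhost:5678 prod yourwebhookid
```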
To stop the running containers:
docker-compose down
This will stop and remove the containers, but your data will persist in the n8n_data volume.
To export cookies in the format the scraper expects for a logged-in session, you can use the Chrome extension provided in the repository.
Follow these steps to install a Chrome extension from a local folder:
Ensure you have the extension's files downloaded or cloned to a local folder on your computer.
Open Google Chrome and navigate to the extensions page by either:
- Typing chrome://extensions/ into the address bar, OR
- Clicking the three dots (menu) in the top-right corner → More Tools → Extensions.
On the extensions page, toggle on Developer mode. You’ll find this option in the upper-right corner of the page.
Click the Load unpacked button (appears after enabling Developer mode).
A file browser will open. Navigate to the folder where the extension files are stored.
Select the folder containing the extension’s manifest.json file and click Select Folder.
The extension will appear on the extensions page. Make sure it is enabled. You should now see the extension icon in the Chrome toolbar, or you can manage it from the extensions page.
That's it! Your extension is now installed from the local folder and ready to use.
If you're experiencing any issues getting started with n8n-ultimate-scraper, you can:
- Open a GitHub issue describing the problem