This Node.js script uses Puppeteer for web scraping and ExcelJS for exporting data to Excel. It extracts product details from a specific website and saves them to an Excel file.
- puppeteer: Node library for browser automation.
- fs: Node.js File System module for file operations.
- ExcelJS: Node library for working with Excel files.
- https: Node.js HTTPS module for making HTTPS requests.
- urlToBase64(url): Converts an image from a URL to a base64-encoded string.
- add_row(product_details, row_index): Adds a row to the Excel file for a given product.
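As an illustration of what urlToBase64(url) might look like, here is a minimal sketch built only on the built-in https module. It is not the script's actual implementation: the status-code check, the error messages, and the small bufferToBase64 helper are assumptions added for clarity.

```javascript
const https = require("https");

// Hypothetical helper: encode raw bytes as a base64 string.
function bufferToBase64(buffer) {
  return buffer.toString("base64");
}

// Sketch of urlToBase64(url): download the image and resolve with its
// base64-encoded contents. Redirects are not followed in this sketch.
function urlToBase64(url) {
  return new Promise((resolve, reject) => {
    https
      .get(url, (res) => {
        if (res.statusCode !== 200) {
          res.resume(); // discard the body so the socket is released
          reject(new Error(`Request failed with status ${res.statusCode}`));
          return;
        }
        const chunks = [];
        res.on("data", (chunk) => chunks.push(chunk));
        res.on("end", () => resolve(bufferToBase64(Buffer.concat(chunks))));
        res.on("error", reject);
      })
      .on("error", reject);
  });
}
```

The returned promise can be awaited inside the scraping loop, and the resulting string handed to whatever code embeds the image in the workbook.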
- Install dependencies:
npm install puppeteer exceljs
- Run the script:
node app.js
- The script will navigate through product pages, collect details, and save the information in output.xlsx.
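The export step could be sketched roughly as follows. The column names (name, price, url), the sheet name "Products", and the saveProducts wrapper are hypothetical, since the product fields are not listed here. This version of add_row also takes the worksheet as an explicit first argument, which differs from the add_row(product_details, row_index) signature described above.

```javascript
// Hypothetical column layout; the real script's fields may differ.
const COLUMNS = ["name", "price", "url"];

// Map a product object to an ordered array of cell values,
// using an empty string for any missing field.
function toRowValues(product_details) {
  return COLUMNS.map((key) => product_details[key] ?? "");
}

// Write one product into the given row of the worksheet.
function add_row(worksheet, product_details, row_index) {
  worksheet.getRow(row_index).values = toRowValues(product_details);
}

// Build a workbook with a header row plus one row per product, then save it.
async function saveProducts(products, filename = "output.xlsx") {
  const ExcelJS = require("exceljs"); // loaded here so the helpers above have no dependency
  const workbook = new ExcelJS.Workbook();
  const worksheet = workbook.addWorksheet("Products");
  worksheet.getRow(1).values = COLUMNS;
  products.forEach((product, i) => add_row(worksheet, product, i + 2));
  await workbook.xlsx.writeFile(filename);
}
```

Keeping the field-to-column mapping in one place means a change to the scraped fields only requires editing COLUMNS.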
- The script assumes a specific HTML structure on the target website. Changes in the structure may require script modifications.
- The browser is launched in visible mode (headless: false). For production, set headless: "new" so that it runs in the background.
- Additional error handling may be needed based on specific use cases.
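One possible way to switch between the two headless settings and add basic cleanup on failure is sketched below. The NODE_ENV check and the launchOptions and withBrowser helpers are assumptions for illustration, not part of the script. The accepted headless values also vary between Puppeteer versions, so check the documentation for the installed release.

```javascript
// Hypothetical helper: visible browser by default, headless: "new" in production.
function launchOptions(env = process.env) {
  const production = env.NODE_ENV === "production";
  return { headless: production ? "new" : false };
}

// Run a task with a browser and always close it, even if the task throws.
async function withBrowser(task) {
  const puppeteer = require("puppeteer"); // loaded here so launchOptions has no dependency
  const browser = await puppeteer.launch(launchOptions());
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}
```

With this shape, running `NODE_ENV=production node app.js` would select the background mode, and a scraping error would no longer leave a browser process behind.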
Feel free to adapt the script to your requirements and consult the Puppeteer and ExcelJS documentation for further customization.