Scrape Millions of Emails Using Selenium and Python

Introduction

This project dynamically and automatically scrapes millions of email addresses from thousands of web pages on [fredmiranda.com](https://fredmiranda.com/). The goal is to build a dataset of email addresses that can be used for various purposes.

Tools

  • Selenium
  • Python
  • IPython Notebook
  • CSV library

Usage

  1. Install Selenium and its dependencies:
       • pip install selenium
  2. Clone or download the repository from GitHub.
  3. Open the email_scraping.ipynb file in IPython Notebook (now Jupyter Notebook).
  4. Install the required libraries listed in the first cell of the notebook.
  5. Set the url variable to the web page you want to scrape emails from.
  6. Run all the cells in the notebook. The code scrapes emails from the web page and keeps running until all the pages have been scraped.
  7. When the run finishes, a CSV file named email_dataset.csv is created in the same directory as the notebook. It contains the email addresses scraped from the website.
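
The steps above can be sketched roughly as follows. This is not the notebook's actual code: it is a minimal illustration that assumes addresses are pulled from the rendered page source with a simple regular expression and written out with Python's csv module. The function names (extract_emails, write_emails, scrape), the regex, and the use of Chrome are assumptions made for this example; only the email_dataset.csv file name comes from the steps above.

```python
import csv
import re

# Simple pattern for illustration only; it will not match every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_source):
    """Return unique email-like strings from page_source, lowercased, in first-seen order."""
    seen = {}
    for match in EMAIL_RE.findall(page_source):
        # Lowercasing is a choice made here so that case variants count as one address.
        seen.setdefault(match.lower(), None)
    return list(seen)


def write_emails(emails, path="email_dataset.csv"):
    """Write one address per row under a single 'email' header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["email"])
        writer.writerows([e] for e in emails)


def scrape(urls, path="email_dataset.csv"):
    """Load each URL in a browser and collect addresses from the rendered HTML."""
    # Imported here so the helpers above can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
    collected = []
    try:
        for url in urls:
            driver.get(url)
            collected.extend(extract_emails(driver.page_source))
    finally:
        driver.quit()
    # Remove duplicates across pages while keeping first-seen order.
    write_emails(list(dict.fromkeys(collected)), path)
```

Keeping the extraction and CSV steps separate from the browser loop means they can be checked on a saved HTML string before a long scraping run is started.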

Conclusion

This project demonstrates how to use Selenium and Python to scrape email addresses dynamically and automatically from thousands of web pages. The resulting dataset can be used for various purposes and is easy to export to other formats.
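
As one illustration of exporting to another format, the sketch below converts the CSV to JSON using only the standard library. It assumes the CSV has a header row (for example a single email column), which the project description does not confirm. The function name csv_to_json and the output file name email_dataset.json are hypothetical.

```python
import csv
import json


def csv_to_json(csv_path="email_dataset.csv", json_path="email_dataset.json"):
    """Convert CSV rows to a JSON array of objects and return the row count."""
    # Assumes the first CSV row is a header; each later row becomes one object.
    with open(csv_path, newline="", encoding="utf-8") as src:
        rows = list(csv.DictReader(src))
    with open(json_path, "w", encoding="utf-8") as dst:
        json.dump(rows, dst, indent=2)
    return len(rows)
```

Calling csv_to_json() in the notebook's directory would read email_dataset.csv and write the JSON file next to it.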