This project is a Python-based educational demonstration of how to scrape email addresses from web pages using Selenium and BeautifulSoup. It is intended as a hands-on learning exercise for those interested in web scraping and data extraction.
The script navigates through Google search results, looking for LinkedIn profiles related to specific marketing tags. It then extracts email addresses found on these pages. The extracted emails are stored in a DataFrame along with the associated tag and country, and finally exported to a CSV file.
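The extract-and-export step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: it uses only the standard library (`re` and `csv`) in place of BeautifulSoup and pandas so that it runs without any installs, and the function names, the regular expression, and the sample text are all assumptions made for the example.

```python
import csv
import io
import re

# Deliberately simple pattern (an assumption, not the project's own);
# it will also match some strings that are not real addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_text):
    """Return unique, lowercased email-like strings in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(page_text):
        email = match.lower()
        if email not in seen:
            seen.append(email)
    return seen


def build_rows(page_text, tag, country):
    """Pair each extracted email with the tag and country it was found under."""
    return [
        {"email": email, "tag": tag, "country": country}
        for email in extract_emails(page_text)
    ]


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["email", "tag", "country"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical page text standing in for a scraped page.
sample_text = "Contact: Jane.Doe@example.com or jane.doe@example.com; sales@example.org."
rows = build_rows(sample_text, tag="seo", country="US")
print(rows_to_csv(rows))
```

With pandas installed, the same rows could instead be passed to `pandas.DataFrame(rows)` and written out with `to_csv(path, index=False)`.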
- Python
- Selenium WebDriver
- BeautifulSoup
- pandas
- Install the required Python libraries with pip:
pip install selenium beautifulsoup4 pandas
- Run the script:
python main.py
This project is for educational purposes only. Web scraping should be done responsibly and in accordance with the terms of service of the website being scraped. Always respect privacy, and do not use this project to send spam or any other form of unsolicited communication.
- Implement a more robust error handling system.
- Improve the email extraction process to reduce false positives.
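One possible way to approach the false-positive item is to filter regex matches after extraction. The sketch below is only an illustration under stated assumptions: the pattern, the list of asset suffixes, the placeholder domains, and the function names are all hypothetical and would need tuning against real pages.

```python
import re

# Same simple pattern as a naive extractor might use (an assumption).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical filters: file names such as "logo@2x.png" look like emails
# to the regex, and placeholder domains rarely belong to real contacts.
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".css", ".js")
PLACEHOLDER_DOMAINS = {"example.com", "domain.com", "email.com"}


def is_plausible_email(candidate):
    """Reject matches that are almost certainly not real addresses."""
    candidate = candidate.lower()
    if candidate.endswith(ASSET_SUFFIXES):
        return False
    local, _, domain = candidate.rpartition("@")
    if not local or local.startswith(".") or local.endswith("."):
        return False
    if ".." in candidate:
        return False
    if domain in PLACEHOLDER_DOMAINS:
        return False
    return True


def extract_plausible_emails(text):
    """Find email-like strings and keep only those that pass the filters."""
    return [match for match in EMAIL_RE.findall(text) if is_plausible_email(match)]


# Hypothetical text containing one plausible address and three false positives.
sample = "logo@2x.png info@acme.io you@example.com a..b@acme.io"
print(extract_plausible_emails(sample))
```

Filtering after matching keeps the pattern itself simple and makes each rejection rule easy to test on its own.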