This repository/project is intended for Educational Purposes ONLY. The project and the corresponding Python script should not be used for any purpose other than learning about web scraping. Make sure you adhere to the terms and conditions of the site!
This script is designed to streamline the process of extracting B2B leads from registers - specifically the Austrian Wirtschaftskammern portal. It automates the tedious task of manually sorting through over a thousand profiles to determine their relevance and to extract necessary contact information. The script outputs data such as the name, address, email, website, and business type from each profile and neatly organizes it into a structured Excel table for easy use in your marketing initiatives.
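To make the output format concrete, here is a minimal sketch of how the extracted fields could be collected and written to an Excel file. It is not the script's actual code: the Lead class, the leads_to_rows and export_leads helpers, and the leads.xlsx file name are hypothetical names chosen for illustration. Only the five field names come from the description above.

```python
from dataclasses import dataclass, asdict


@dataclass
class Lead:
    """One scraped profile (hypothetical structure, fields as listed above)."""
    name: str
    address: str = ""
    email: str = ""
    website: str = ""
    business_type: str = ""


def leads_to_rows(leads):
    """Convert Lead objects into plain dicts, one per table row."""
    return [asdict(lead) for lead in leads]


def export_leads(leads, path="leads.xlsx"):
    """Write the leads to an Excel file (assumed file name: leads.xlsx)."""
    # Imported here so the helpers above also work without pandas installed.
    # Writing .xlsx files with pandas requires the openpyxl package.
    import pandas as pd

    pd.DataFrame(leads_to_rows(leads)).to_excel(path, index=False)


# Made-up sample data, not taken from the portal.
sample = [Lead(name="Example GmbH", email="office@example.com")]
print(leads_to_rows(sample))
```

Calling export_leads(sample) would then produce a one-row spreadsheet with one column per field.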
These instructions will guide you on how to run the script on your local machine for development and testing purposes.
The script requires the following Python packages:
- requests
- beautifulsoup4
- pandas
- openpyxl
You can install these packages using pip:
pip install -r requirements.txt
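If you want to confirm that the installation worked before running the script, a small check along these lines may help. This is a sketch, not part of the project: the REQUIRED mapping and the missing_packages helper are hypothetical names. Note that beautifulsoup4 is imported under the module name bs4.

```python
from importlib.util import find_spec

# pip package name -> module name used in import statements
REQUIRED = {
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "pandas": "pandas",
    "openpyxl": "openpyxl",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```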
- Customize the search filters on the website (firmen.wko.at) according to your needs.
- Copy the resulting link and assign it to the url variable in the script.
- Run the script.
For example:
# URL to scrape data from
url = "https://firmen.wko.at/-/wien/?branche=25376&branchenname=immobilienmakler&page=1"
Then execute the Python script in your preferred environment.
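The example URL ends in a page=1 query parameter. Assuming this parameter selects the results page (an assumption based on the URL, not something the portal documents here), the URLs for further pages could be derived as in the sketch below. The page_urls helper and its last_page argument are hypothetical and not part of the script.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def page_urls(url, last_page):
    """Yield url with its page parameter set to each page up to last_page.

    Starts at the page number already present in the URL (1 if absent).
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    first_page = int(query.get("page", ["1"])[0])
    for page in range(first_page, last_page + 1):
        query["page"] = [str(page)]
        new_query = urlencode(query, doseq=True)
        yield urlunsplit(parts._replace(query=new_query))


url = "https://firmen.wko.at/-/wien/?branche=25376&branchenname=immobilienmakler&page=1"
for page_url in page_urls(url, 3):
    print(page_url)
```

Only the page value changes between the generated URLs, so the filters you configured on the website are preserved.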
Contributions are what make the open source community an incredible place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the project.
- Create your feature branch (git checkout -b feature/AmazingFeature).
- Commit your changes (git commit -m 'Add some AmazingFeature').
- Push to the branch (git push origin feature/AmazingFeature).
- Open a pull request.
Distributed under the MIT License. See LICENSE for more information.
Feel free to reach out if you have any questions or if there's anything else I can do to help!