Wikipedia scraper for extracting and organizing data on the largest companies in the United States by revenue into a structured CSV file.

Wikipedia Web Scraping Python Project

Description: This Python script uses the BeautifulSoup and Requests libraries to scrape the Wikipedia page listing the largest companies in the United States by revenue. The scraped table is then converted into a structured DataFrame with the pandas library, and the final step exports this data to a CSV file.
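The core parse-and-export steps can be sketched as below. To keep the sketch runnable offline it parses a small inline HTML sample rather than fetching the live page; in the notebook the HTML would come from a `requests.get(...)` call, and the column names and output filename here are illustrative, not necessarily those used by the project:

```python
from bs4 import BeautifulSoup
import pandas as pd

# In the notebook, this HTML would come from something like:
#   requests.get("https://en.wikipedia.org/wiki/List_of_largest_companies_in_the_United_States_by_revenue").text
# A tiny inline sample is used here so the sketch runs without network access.
html = """
<table class="wikitable sortable">
  <tr><th>Rank</th><th>Name</th><th>Revenue (USD millions)</th></tr>
  <tr><td>1</td><td>Walmart</td><td>611,289</td></tr>
  <tr><td>2</td><td>Amazon</td><td>513,983</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", class_="wikitable")

# Header cells (<th>) become the DataFrame column names.
headers = [th.get_text(strip=True) for th in table.find_all("th")]

# Each remaining row (<tr>) becomes one record.
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in table.find_all("tr")[1:]
]

df = pd.DataFrame(rows, columns=headers)
df.to_csv("largest_companies.csv", index=False)  # hypothetical output filename
print(df.head())
```

Stripping whitespace with `get_text(strip=True)` avoids stray newlines that Wikipedia table cells often contain.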

Usage:

  1. Clone the repository: git clone https://github.com/SaiSurajMatta/Wikipedia-Web-Scraping-Python-Project
  2. Install the required dependencies: pip install beautifulsoup4 requests pandas
  3. Open and run the notebook: Wikipedia_Web_Scraping_Project.ipynb

Requirements:

  • Python 3
  • BeautifulSoup
  • Requests
  • Pandas

How to Contribute:

  1. Fork the repository.
  2. Create a new branch: git checkout -b feature/new-feature
  3. Make your changes and commit them: git commit -m 'Add new feature'
  4. Push to the branch: git push origin feature/new-feature
  5. Create a pull request.