Python file for scraping airline reviews

Airline Reviews

Source

The airline reviews dataset is collected from the website: https://www.airlinequality.com/review-pages/a-z-airline-reviews/

Methodology

The dataset is created using the following steps:

  1. Scraping: The names of all airlines are scraped from the website mentioned above.
  2. URL Formation: The URLs for each airline's review page are constructed based on the website's structure.
  3. Review Data Scraping: Each airline's review page is scraped to collect information regarding customer reviews.
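
As an illustration of step 2, the sketch below builds a review-page URL from an airline's display name. The path layout (`airline-reviews/<slug>/page/<n>/`) and the slug rule are assumptions made for this example, and the helper names `slugify` and `review_page_url` are hypothetical; none of them are taken from the repository's script.

```python
import re
from urllib.parse import urljoin

# Site named in the Source section.
BASE_URL = "https://www.airlinequality.com/"


def slugify(airline_name: str) -> str:
    """Turn a display name such as 'Air France' into a slug ('air-france').

    Assumption: slugs are lowercase words joined by hyphens.
    """
    return re.sub(r"[^a-z0-9]+", "-", airline_name.lower()).strip("-")


def review_page_url(airline_name: str, page: int = 1) -> str:
    """Build the assumed URL of one page of reviews for an airline."""
    # Assumed layout: airline-reviews/<slug>/page/<n>/
    path = f"airline-reviews/{slugify(airline_name)}/page/{page}/"
    return urljoin(BASE_URL, path)


if __name__ == "__main__":
    print(review_page_url("Air France", page=2))
```

Keeping URL construction in a pure function makes it easy to check against the live site before any requests are sent.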

Technology Stack

The data collection process involves the following technologies:

  • Beautiful Soup: Used for parsing the website and extracting relevant information.
  • Pandas: Used for storing the data and exporting it to CSV.
  • Requests: Employed for making HTTP requests to fetch web pages.
  • Unicodedata: Python's standard-library module, used for handling and processing Unicode characters.
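
A minimal sketch of how these four libraries could fit together is shown below. The HTML structure it looks for (`article` elements containing an `h2` title and a `div` with class `text_content`), the NFKC normalization choice, and all function names are assumptions for illustration, not the repository's actual code. Third-party packages are imported inside the functions that need them so the text helper works on its own.

```python
import unicodedata


def clean_text(text: str) -> str:
    """Normalize Unicode (e.g. non-breaking spaces) and collapse whitespace."""
    normalized = unicodedata.normalize("NFKC", text)
    return " ".join(normalized.split())


def fetch_html(url: str) -> str:
    """Download one page; raises on HTTP errors."""
    import requests  # third-party

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def parse_reviews(html: str) -> list:
    """Extract review titles and bodies from a page of HTML.

    Assumption: each review is an <article> with an <h2> title and a
    <div class="text_content"> body.
    """
    from bs4 import BeautifulSoup  # third-party

    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for article in soup.find_all("article"):
        title = article.find("h2")
        body = article.find("div", class_="text_content")
        if title is None or body is None:
            continue  # skip articles that do not match the assumed layout
        rows.append({
            "title": clean_text(title.get_text()),
            "review": clean_text(body.get_text()),
        })
    return rows


def save_reviews(rows: list, path: str) -> None:
    """Write the collected rows to a CSV file."""
    import pandas as pd  # third-party

    pd.DataFrame(rows).to_csv(path, index=False)
```

A run for a single page would chain the pieces, for example `save_reviews(parse_reviews(fetch_html(url)), "reviews.csv")`, where `url` is a review-page URL and `reviews.csv` is a placeholder file name.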

Dataset Availability

The refined dataset is available for analysis and exploration at: https://www.kaggle.com/datasets/juhibhojani/airline-reviews