This is a web scraping project built with Scrapy to extract restaurant reviews from TripAdvisor. The project allows you to scrape review data such as reviewer names, review text, and ratings from a specific location on TripAdvisor.
Before you can use this web scraper, ensure you have the following installed on your system:
- Python (3.6 or higher)
- Scrapy
You can install Scrapy using pip:
pip install Scrapy
Getting Started Clone this repository to your local machine:
git clone https://github.com/yourusername/tripadvisor-scraper.git
Navigate to the project directory:
cd tripadvisor-scraper
Set up your Scrapy project by creating a new Scrapy spider. You can modify the spider code to customize the scraping process for your specific needs.
Usage To run the web scraper, follow these steps:
Ensure you are in the project directory.
Run the Scrapy spider with the following command:
scrapy crawl tripadvisor_reviews
The spider will start scraping review data and save it to a CSV file named tripadvisor_reviews.csv in the project directory.
Review the scraped data in the CSV file or process it as needed.
Scraped Data The spider scrapes the following data fields:
reviewer_name: The name of the reviewer. review_text: The text of the review. rating: The rating of the restaurant. Contributing If you would like to contribute to this project, feel free to create a pull request with your changes or open an issue if you have any suggestions or questions.
License This project is licensed under the MIT License - see the LICENSE file for details.