Marriott Hotels Scraping Assignment

This is a Python-based web scraper that extracts information about Marriott Hotels. The scraper is built using Scrapy, an open-source and collaborative web crawling framework written in Python.

Requirements

The following software is required to run the scraper:

Python 3.10
Scrapy

Installation

Clone the repository:

git clone https://github.com/rsumit123/mariott-scraping-assignment

Change into the repository directory:
```
cd mariott-scraping-assignment
```
Open a pipenv shell:
```
pipenv shell
```
Install the required packages:
```
pipenv install
```

Usage

Navigate to the project directory:
```
cd scraping-assignment
```
Run the scraper:
```
python hotel_scraper.py
```
The scraper extracts information for the hotel URLs listed in the code. The CSV output for those URLs is written to the output folder.

Output

The extracted data will be saved in a CSV file named output/hotel_code_data.csv. The CSV file will contain the following columns:

checkin: Check-in time
PerNight: Price per night, without taxes
checkout: Check-out time
roomname: Name of the room
ratename: Name of the rate (e.g., Flexible)
StayTotalwTaxes: price including taxes
currency: currency
availability: Availability of the room (NA if not present)
cancelpolicy: cancellation policy if mentioned else NA
paymentpolicy: payment policy if mentioned else NA
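
As a rough illustration of how the output could be consumed, the sketch below reads a CSV with the columns listed above and converts the NA placeholder to None. It is not part of the scraper: the helper name load_rows and the UTF-8 encoding are assumptions, and the column order in the real file may differ from the order used here.

```python
import csv
from pathlib import Path

# Column names as listed in this README; the order in the real file may differ.
EXPECTED_COLUMNS = [
    "checkin", "PerNight", "checkout", "roomname", "ratename",
    "StayTotalwTaxes", "currency", "availability",
    "cancelpolicy", "paymentpolicy",
]


def load_rows(csv_path):
    """Hypothetical helper: read the CSV and map the "NA" placeholder to None."""
    # UTF-8 is an assumption; adjust if the scraper writes another encoding.
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Fail early if the header lacks any documented column.
        missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        return [
            {key: (None if value == "NA" else value) for key, value in row.items()}
            for row in reader
        ]
```

Calling load_rows("output/hotel_code_data.csv") after a run would return one dictionary per row, with None wherever the scraper recorded NA.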

License

This project is licensed under the MIT License. See the LICENSE file for more information.