Yelp restaurant data scraping using Python and a Scrapy spider

Deployment

1. Clone Repository

  git clone https://github.com/farukalampro/yelp-webscraper-using-scrapy-python.git

  cd yelp-webscraper-using-scrapy-python

2. Create Virtual Environment

  python -m venv env

For Windows:

  .\env\Scripts\activate

For macOS/Linux:

  source env/bin/activate

3. Install Required Packages

  pip install -r requirements.txt

4. Input your own link from yelp.com

Open the data.py file and add your own Yelp search link to start_urls.
One link is already included in data.py as a sample. You can add as many links as you want.

  start_urls = [
      # This is the sample URL.
      # Replace it with your own search link.
      'https://www.yelp.com/search?find_desc=Restaurants&find_loc=San+Francisco%2C+CA',
  ]
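If you would rather not copy links from the browser, a search link can also be assembled in Python. The snippet below is only a sketch: the helper name build_search_url is made up for this example and is not part of the repository, and the find_desc and find_loc parameters are taken from the sample URL above.

```python
from urllib.parse import urlencode

YELP_SEARCH = "https://www.yelp.com/search"


def build_search_url(description, location):
    """Return a Yelp search URL for a search term and a location.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    # urlencode turns spaces into '+' and commas into '%2C',
    # which matches the encoding of the sample URL.
    query = urlencode({"find_desc": description, "find_loc": location})
    return f"{YELP_SEARCH}?{query}"


start_urls = [
    build_search_url("Restaurants", "San Francisco, CA"),
    build_search_url("Restaurants", "Oakland, CA"),
]

print(start_urls[0])
# https://www.yelp.com/search?find_desc=Restaurants&find_loc=San+Francisco%2C+CA
```

The resulting list can be pasted into data.py in place of the sample link.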

5. Run the command in the terminal

  scrapy crawl data -o sample_file.csv

You can export the data in other formats as well, such as JSON or XML. Scrapy picks the format from the file extension. The general form of the command is:

  scrapy crawl <spider_name> -o <file_name>.<csv|json|xml>

Some restaurant data scraped with this spider is available in the Sample File folder.
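After a crawl finishes, a quick way to check the export is to count the rows and list the columns. This is a minimal sketch using only the standard library. It assumes the output was written to sample_file.csv as in the command above, and it makes no assumption about which columns the spider produces. The function name summarize_csv is made up for this example.

```python
import csv
import os


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        row_count = sum(1 for _ in reader)
        # fieldnames is None when the file is completely empty.
        columns = reader.fieldnames or []
    return columns, row_count


# Assumed file name, taken from the crawl command above.
OUTPUT_FILE = "sample_file.csv"

if os.path.exists(OUTPUT_FILE):
    columns, row_count = summarize_csv(OUTPUT_FILE)
    print(f"{row_count} rows, columns: {columns}")
else:
    print(f"{OUTPUT_FILE} not found; run the crawl first.")
```

A row count of zero usually means the spider ran but did not match anything on the page.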

Important Note

Yelp updates its website frequently, so make sure the XPath expressions in the spider are kept up to date.
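One way to notice an outdated XPath early is to check which expressions no longer match anything. The sketch below illustrates the idea with the standard library only. The selectors, class names, and sample markup are all hypothetical and do not reflect Yelp's real pages or the expressions used in this spider. Note that xml.etree.ElementTree only parses well-formed markup and supports a subset of XPath; inside the spider you would run the same kind of check against response.xpath(...) instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical selectors; the real expressions live in the spider and will differ.
SELECTORS = {
    "name": ".//h3[@class='biz-name']",
    "rating": ".//span[@class='rating']",
}


def find_stale_selectors(markup, selectors):
    """Return the field names whose XPath matches nothing in the markup."""
    root = ET.fromstring(markup)
    return [field for field, xpath in selectors.items() if not root.findall(xpath)]


# Hypothetical page fragment in which the rating class has been renamed.
SAMPLE = "<div><h3 class='biz-name'>Example Diner</h3><span class='stars'>4.5</span></div>"

print(find_stale_selectors(SAMPLE, SELECTORS))
# ['rating']
```

Any field reported here is a candidate for an updated expression.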
