Lowes Crawling

A Scrapy application for crawling the Lowes store


About the project   |    Technologies   |    Getting started   |    License

👨🏻‍💻 About the project

Lowes Scrapy is a crawler for the Lowes store. It was initially designed to return information about all refrigerators, but it can do the same for any appliance category: just change the URL on line 9 of ./lowes/spiders/refrigerator.py.

You can use any of the category links listed on the Lowes Appliances page, such as Washers & Dryers or Ranges.
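As a minimal sketch of what that one-line change amounts to (the `start_url` helper and the category slug below are illustrative assumptions, not the spider's actual code or a verified Lowes path):

```python
# Hypothetical sketch: the spider's behaviour per appliance category is driven
# entirely by the listing URL it starts from. The slug format below is a
# placeholder assumption, not a verified Lowes path.

def start_url(category_slug: str) -> str:
    """Build the category listing URL the spider would start from."""
    return f"https://www.lowes.com/pl/{category_slug}"

# Swapping appliance categories means swapping the slug, e.g.:
print(start_url("Refrigerators"))
```

Whatever category you pick, the rest of the spider stays the same; only the start URL on line 9 changes.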

The scraped data is saved as .csv files in the data folder.

[changes] - Added a Puppeteer crawl responsible for scraping product prices. It runs from the CSV file generated by the Scrapy crawl.
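The handoff between the two crawls is just that CSV file. A sketch of how the product rows could be read back (the column names here are assumptions — check the header of the file your crawl actually generates):

```python
import csv
import io

# Hypothetical sample mimicking the shape of the Scrapy output; the real
# refrigerators.csv may use different column names.
sample = (
    "name,url\n"
    "Example Fridge,https://www.lowes.com/pd/example\n"
)

# The Puppeteer step iterates over rows like these and visits each product
# URL to scrape its price.
rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    print(row["url"])
```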

🚀 Technologies

Technologies I used to develop this crawler

💻 Getting started

Requirements

Note: I recommend using Docker.

Clone the project and access the folder

$ git clone https://github.com/lucasleonardobs/lowes-scrapy.git && cd lowes-scrapy

Follow the steps below

# To run Scrapy Crawl
# Create a virtual environment using conda cli
$ conda env create -f envname.yml

# Activate the virtual env (use the environment name defined in envname.yml)
$ conda activate <env-name>

# Run scrapy
$ scrapy crawl refrigerator -o refrigerators.csv

# To run the Puppeteer Crawl (only after the Scrapy crawl has finished,
# because it uses the CSV file the Scrapy crawl generated)

# Go to the Puppeteer folder
$ cd lowes-puppeteer

# Install Node dependencies
$ yarn

# Run the Crawl
$ node src/index.js

# Done! Wait for the crawl to finish, then check the refrigerators.csv file
# (and prices.csv, which is currently not working)

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with 💜 by Lucas Leonardo 👋 See my LinkedIn