/reads

Real Estate Agency Data Scraper

Primary language: Python. License: MIT.

R.E.A.D.S - Real Estate Agency Data Scraper

A project built to crawl real estate agency websites. It can extract the price, location, and other details of each listing.

Built with Scrapy, a Python framework for extracting data from web pages.

This project currently has spiders for the following websites:

Country Agency
Brazil Stória Imóveis
Brazil ImovelWeb
Brazil ZapImóveis
Brazil VivaReal

Dependencies

Major

Package Version
Python v3.6.5

Python

Package Version
Selenium v3.12.0

Extra

Package Version
GeckoDriver¹ v0.20.1

¹ : GeckoDriver can also be installed using the command npm install -g geckodriver
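Before installing the Python packages, it can help to confirm that the interpreter and GeckoDriver are reachable. The sketch below is a minimal check, not part of the project: the function name check_environment is hypothetical, and treating the listed Python version as a lower bound (3.6 or newer) is an assumption.

```python
import shutil
import sys

# Assumption: the Python version listed above is treated as a minimum.
MIN_PYTHON = (3, 6)


def check_environment(driver_name="geckodriver"):
    """Report whether Python is recent enough and where the driver lives."""
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        # shutil.which returns None when the executable is not on PATH.
        "driver_path": shutil.which(driver_name),
    }


if __name__ == "__main__":
    status = check_environment()
    print("Python >= 3.6:", status["python_ok"])
    print("geckodriver:", status["driver_path"] or "not found on PATH")
```

If the driver is reported as missing, install it with the npm command from the footnote or add its location to PATH.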

How to

Clone the repository

To clone the repository, run in the command line:

$ git clone http://github.com.br/MatheusDosReis/real-estate-agency-scraper

$ cd real-estate-agency-scraper

Install Python dependencies

Run the command below:

$ pip install -r requirements.txt

Create the results folder

Run the command:

$ mkdir results
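The same step can be done from Python, which is handy when a script sets up the project on different operating systems. This is a small sketch under the assumption that the folder lives in the repository root; the helper name ensure_results_dir is hypothetical and not part of the project.

```python
from pathlib import Path


def ensure_results_dir(base="."):
    """Create the results folder under `base` if missing and return its path."""
    results = Path(base) / "results"
    # exist_ok=True makes the call safe to repeat on an existing folder.
    results.mkdir(parents=True, exist_ok=True)
    return results


if __name__ == "__main__":
    print("Results folder:", ensure_results_dir().resolve())
```

Unlike a plain mkdir, calling this twice does not fail when the folder already exists.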

Usage

Spiders available

List of names of the available spiders:

  • storia
  • imovelweb
  • zapimoveis
  • vivareal

Run a spider

To run a specific spider:

$ scrapy crawl <name_of_the_spider>
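The command can also be launched from a Python script, for example to run several spiders in sequence. The sketch below only wraps the scrapy crawl command shown above and validates the name against the spider list in this README; the helper names build_crawl_command and run_spider are hypothetical, and the script is assumed to run from the repository root with Scrapy installed.

```python
import subprocess

# Spider names as listed in this README.
SPIDERS = ("storia", "imovelweb", "zapimoveis", "vivareal")


def build_crawl_command(spider):
    """Return the argument list for `scrapy crawl <spider>`."""
    if spider not in SPIDERS:
        raise ValueError(
            f"Unknown spider {spider!r}; choose one of {', '.join(SPIDERS)}"
        )
    return ["scrapy", "crawl", spider]


def run_spider(spider):
    """Run one spider; raises CalledProcessError if scrapy exits with an error."""
    return subprocess.run(build_crawl_command(spider), check=True)


if __name__ == "__main__":
    # Show the commands without executing them.
    for name in SPIDERS:
        print(" ".join(build_crawl_command(name)))
```

Calling run_spider("vivareal") is then equivalent to typing scrapy crawl vivareal in the terminal.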