Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

base.gov.pt

A crawler that fetches data from http://www.base.gov.pt/.

Spiders

Usage

Download the spider and its dependencies:

git clone 'https://github.com/ajcerejeira/base.gov.pt.git'
cd base.gov.pt/
pip install -r requirements.txt

And then run the desired spider:

scrapy crawl get_contracts

This will generate the following files:

  • contracts.csv - the main table, with the most important details of each contract
  • contestants.csv
  • invitees.csv
  • documents.csv
  • places.csv
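Once the crawl has finished, a quick sanity check of the output can be sketched in Python as follows. This is only an illustration, not part of the project: the helper name `summarize_outputs` is hypothetical, and it assumes the files are comma-separated, UTF-8 encoded, start with a header row, and sit in the directory where the spider was run. The column names are not documented here, so the sketch simply reports whatever header each file has.

```python
import csv
from pathlib import Path

# File names taken from the list above. Column names are not assumed:
# the header of each file is read as-is.
OUTPUT_FILES = [
    "contracts.csv",
    "contestants.csv",
    "invitees.csv",
    "documents.csv",
    "places.csv",
]


def summarize_outputs(directory="."):
    """Return {file name: (column names, row count)} for each output file found."""
    summary = {}
    for name in OUTPUT_FILES:
        path = Path(directory) / name
        if not path.exists():
            continue  # e.g. the crawl is still running or was interrupted
        # Assumption: UTF-8 encoded, comma-separated, first row is a header.
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            row_count = sum(1 for _ in reader)
            summary[name] = (list(reader.fieldnames or []), row_count)
    return summary


if __name__ == "__main__":
    for name, (columns, rows) in summarize_outputs().items():
        print(f"{name}: {rows} rows, columns: {', '.join(columns)}")
```

Files that are missing are skipped rather than treated as errors, so the script can also be run while a long crawl is still in progress.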

Please be patient: a full crawl takes many hours to complete (on my machine, gathering all the data took about 26 hours).