The challenges are here.
All data come from the Titanic disaster (does it remind you of Kaggle?)
This workshop is written for Python 2.7; Scrapy only gained Python 3 support in version 1.1.
Please install Python 2.7, not Python 3.x!
git clone https://github.com/fabienvauchelles/scraping-challenge-workshop.git
cd scraping-challenge-workshop
pip install -r requirements.txt
The scraper code is in the file myscraper/spiders/myscraper.py.
Items are defined in the file myscraper/items.py.
cd scraping-challenge-workshop
scrapy crawl myscraper -t jsonlines -o persons.json
Exported items are written to the file persons.json.
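Because the crawl command uses the jsonlines format, persons.json holds one JSON object per line rather than a single JSON array, so calling json.load on the whole file will usually fail. The snippet below is a minimal sketch of how the export could be read back for inspection. The helper name load_items is hypothetical and not part of the workshop code; it uses only the standard library and should run on Python 2.7 as well as 3.x.

```python
import io
import json


def load_items(path):
    """Read a JSON Lines file: one JSON object per non-empty line.

    `load_items` is a hypothetical helper, not part of the workshop repository.
    """
    items = []
    # io.open behaves the same on Python 2.7 and 3.x and decodes as UTF-8.
    with io.open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                items.append(json.loads(line))
    return items
```

After a crawl, load_items("persons.json") should return a list of dictionaries, one per scraped item, with keys matching the fields declared in myscraper/items.py.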
See the Licence.