This repo contains code to scrape player statistics from the achahockey.org website. It is a proof of concept, developed to reduce the manual labor of data entry.
After cloning the repo, you can scrape data as follows:
$ scrapy crawl acha
Scrapy supports several formats for storing scraped data. To store the data as JSON, CSV, or XML, run the corresponding command:
$ scrapy crawl acha -o items.json -t json
$ scrapy crawl acha -o items.csv -t csv
$ scrapy crawl acha -o items.xml -t xml
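Once an export has run, the JSON file can be inspected with a few lines of Python. The following is a minimal sketch, assuming the export was written to items.json as in the command above. The helper name load_items is hypothetical, and the fields in each record depend on how the spider defines its items.

```python
import json


def load_items(path="items.json"):
    """Return the records from a Scrapy JSON export as a list."""
    # Scrapy's JSON exporter writes a single JSON array of objects.
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    items = load_items()
    print(f"Loaded {len(items)} records")
    if items:
        # Field names come from the spider's item definition.
        print("Fields:", sorted(items[0]))
```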
An automated script for running the scraping routines is also available. It is intended for future use via CGI on a NearlyFreeSpeech.NET web server.
$ python crawl.py
The command above scrapes the data and stores it in a JSON file.
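For readers who want to see how such a wrapper might be put together, here is a minimal sketch. It is not the actual contents of crawl.py; it simply shells out to the scrapy command shown earlier. The function names build_command and run_crawl and the default output file items.json are assumptions made for illustration.

```python
import subprocess
import sys


def build_command(spider="acha", output="items.json"):
    """Build the argument list for the same crawl command used on the command line."""
    return ["scrapy", "crawl", spider, "-o", output]


def run_crawl(spider="acha", output="items.json"):
    """Run the crawl from the project directory and return scrapy's exit code."""
    completed = subprocess.run(build_command(spider, output))
    return completed.returncode


if __name__ == "__main__":
    # Propagate scrapy's exit status so a caller (for example a CGI
    # handler) can tell whether the crawl succeeded.
    sys.exit(run_crawl())
```

Passing the arguments as a list avoids shell quoting issues, and returning the exit code leaves error handling to whatever calls the script.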
For more information on how to use Scrapy, please see the Scrapy Reference.
This is an open source project. Feel free to fork it and submit pull requests at will.