Inspired largely by the project introduced here, I walked through the process for a different town to help out a friend. Along the way, I learned a bit about the awesome scrapy library, as well as PostgreSQL. Hope you like what I've done! Plots are on the way.
The data in this repo was obtained via the command
wget http://ec2-54-235-58-226.compute-1.amazonaws.com/storage/f/2013-05-12T03%3A50%3A18.251Z/dcneighorhoodboundarieswapo.geojson
This GeoJSON is courtesy of OpenDC Data, so thanks to them!
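If you want a quick sanity check on the download, something like this works (just a sketch; it prints the feature properties rather than assuming any particular field names):

    import json

    # Load the neighborhood boundary file fetched above.
    with open("dcneighorhoodboundarieswapo.geojson") as f:
        boundaries = json.load(f)

    # A GeoJSON FeatureCollection holds one feature per neighborhood.
    features = boundaries["features"]
    print("%d neighborhoods" % len(features))
    print(sorted(features[0]["properties"].keys()))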
The file make_raw_data_table.sql
contains a PostgreSQL statement that creates the table, which is then populated by the web scraping tool.
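For a sense of how that loading happens, a scrapy item pipeline along these lines pushes each scraped item into Postgres. This is only a sketch -- the table name, column names, and connection settings below are placeholders; the real schema is in make_raw_data_table.sql and the real pipeline lives in the scrapy project.

    import psycopg2

    class PostgresPipeline(object):

        def open_spider(self, spider):
            # Placeholder connection settings; point these at your database.
            self.conn = psycopg2.connect(dbname="apartments", user="postgres")
            self.cur = self.conn.cursor()

        def process_item(self, item, spider):
            # Column names here are illustrative, not the actual schema.
            self.cur.execute(
                "INSERT INTO raw_data (title, price, url) VALUES (%s, %s, %s)",
                (item.get("title"), item.get("price"), item.get("url")),
            )
            self.conn.commit()
            return item

        def close_spider(self, spider):
            self.cur.close()
            self.conn.close()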
You will also need to install a couple of libraries. I set this up with Anaconda and it was super easy -- pip was more of a headache and, as it often does, required quite a few apt-gets for various headers. I found it only worked if the virtualenv was created with the --system-site-packages
flag; then
pip install -r requirements.pip.text
worked out okay. And then... the .so linkage failed anyway.
But seriously, just do the Anaconda thing already. Everybody's doing it. Or, even better:
conda create -n dcapa --file requirements.txt
source activate dcapa
Activate the environment you created above (or work in a plain old environment) and execute
scrapy crawl aptspider
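If you're curious what that command invokes: scrapy looks up the spider whose name attribute is "aptspider" and runs it. The real spider lives in this repo; a stripped-down sketch of the shape (with a made-up listing site and selectors, using a recent scrapy API) looks like:

    import scrapy

    class AptSpider(scrapy.Spider):
        name = "aptspider"
        # Placeholder URL -- the real spider targets an actual listings site.
        start_urls = ["https://example.com/apartments"]

        def parse(self, response):
            # Placeholder selectors; the real ones match the target site's markup.
            for listing in response.css("div.listing"):
                yield {
                    "title": listing.css("a.title::text").get(),
                    "price": listing.css("span.price::text").get(),
                    "url": response.urljoin(listing.css("a.title::attr(href)").get()),
                }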