Here you can find the Python scripts used for the ENERGIC Datathon (http://vgibox.eu/activities/datathon-challenge/). Please read the methodology first: http://landcover.como.polimi.it/datathon/pages/methodology.html
Contents:
data_acquisition:
wunderapi.js: downloads data from the Weather Underground API
data_processing:
wunder.py: preprocessing of the Weather Underground data
ts_cellid_onev.py: preprocessing of the Telecom Italia open data
exploration.py: statistical exploration of the Telecom Italia open data
preanova.py: merges the telecom data and the temperature data for the ANOVA analysis
analysis:
twowayanova.R: two-way ANOVA with unequal sample sizes
data:
cellid_wunder2013stations: table relating Milano grid cell IDs to Weather Underground stations
stationsdata.zip: weather data extracted from the Weather Underground API, November to December 2013
stationsdata_aggregated.zip: CSV files of weather data aggregated by hour (mean), December 2013
data_dic.csv: a single CSV file with weather data aggregated by hour (mean), December 2013
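The hourly mean aggregation mentioned for stationsdata_aggregated.zip and data_dic.csv can be sketched roughly as follows. This is only an illustration using the Python standard library, not the code in wunder.py: the function name `aggregate_hourly`, the (station, timestamp, temperature) record layout, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_hourly(readings):
    """Average temperature readings per (station, hour).

    `readings` is an iterable of (station_id, iso_timestamp, temperature)
    tuples. This layout is hypothetical, not the real file format.
    """
    buckets = defaultdict(list)
    for station_id, timestamp, temperature in readings:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0
        )
        buckets[(station_id, hour)].append(temperature)
    # One mean value per station and hour, in sorted key order.
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Made-up sample readings, for illustration only.
sample = [
    ("station_a", "2013-12-01T10:05:00", 4.0),
    ("station_a", "2013-12-01T10:35:00", 6.0),
    ("station_a", "2013-12-01T11:10:00", 7.0),
    ("station_b", "2013-12-01T10:20:00", 3.0),
]
hourly = aggregate_hourly(sample)
for (station_id, hour), value in hourly.items():
    print(station_id, hour.isoformat(), value)
```

Each (station, hour) pair ends up with a single mean value, which is the shape the aggregated files are described as having.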
Large files (download from the Dropbox links):
datamerged_dic.csv: a single CSV file with weather data aggregated by hour (mean), together with the Milano grid cell IDs, December 2013
https://www.dropbox.com/s/ku3qvxi1v25zn80/datamerged_dic.csv?dl=0
callsout_dic.csv: a single CSV file with outgoing calls in the city of Milano during December 2013
https://www.dropbox.com/s/8tsvmb8vl5zr25e/callsout_dic.tsv?dl=0
callsout_normal_dic.csv: a single CSV file with outgoing calls in the city of Milano during December 2013, normalized (log(x))
https://www.dropbox.com/s/p54x3955ql86m0e/callout_normal_dic.csv?dl=0
anovadata.csv: a single CSV file with outgoing calls in the city of Milano during December 2013, classified by day type (weekends and workdays) and temperature levels (high, medium, maximum)
https://www.dropbox.com/s/mbm5p2cfefuci7v/anovadata.csv?dl=0
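
The log normalization and the day-type classification described for callsout_normal_dic.csv and anovadata.csv could look roughly like the sketch below. It is not taken from preanova.py: the helper names `day_type` and `log_normalize` and the sample values are hypothetical, the natural logarithm is assumed for log(x), and public holidays are not treated specially. Temperature levels are left out because their thresholds are not stated here.

```python
import math
from datetime import date


def day_type(day):
    """Return 'weekend' for Saturday or Sunday, otherwise 'workday'."""
    # date.weekday() is 0 for Monday through 6 for Sunday.
    return "weekend" if day.weekday() >= 5 else "workday"


def log_normalize(calls):
    """Natural log of a strictly positive call count (assumed base)."""
    return math.log(calls)


# Made-up sample rows: (date, outgoing calls).
rows = [
    (date(2013, 12, 2), 120.0),
    (date(2013, 12, 7), 45.0),
]
labelled = [(d.isoformat(), day_type(d), log_normalize(c)) for d, c in rows]
for row in labelled:
    print(row)
```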