Inferring Social Ties from Human Mobility Patterns
This repository contains code that was used in the thesis "Human mobility Patterns at Large-Scale Events".
The figure above shows the data pipeline that was built to infer social ties.
Basic flow of the pipeline:
- The PostgreSQL database schema is built using bash scripts (see `pipeline/build-schema`)
- The data set is ingested into the schema using `pg_dump`.
- Database indexes are set up using bash scripts (see `pipeline/build-indexes`)
- Features are computed using SQL stored procedures (see `pipeline/sql-pipeline`)
- Based on these features, supervised learners are trained using Python and `sklearn` (see `pipeline/train-models`)
- The social graph is built using Python and `networkx` (see `pipeline/build-social-graph`)
- Community detection is applied using `infomap` (see `pipeline/graph-statistics`)
- Various graph statistics are computed using `networkx` (see `pipeline/graph-statistics`)
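
The last steps of this flow (turning predicted ties into a social graph and summarising it) can be illustrated with a small sketch. It is not the code in this repository: the pair scores, the 0.5 threshold, and all function names are hypothetical, and it uses only the Python standard library rather than `networkx` so that it runs without dependencies. Connected components stand in here as a very simple grouping; this is not Infomap community detection.

```python
from collections import defaultdict

# Hypothetical classifier output: (person_a, person_b, probability of a social tie).
PAIR_SCORES = [
    ("a", "b", 0.91),
    ("b", "c", 0.78),
    ("c", "d", 0.12),
    ("d", "e", 0.85),
    ("a", "e", 0.05),
]


def build_social_graph(pair_scores, threshold=0.5):
    """Return an undirected adjacency map with an edge for each pair scored >= threshold."""
    graph = defaultdict(set)
    for a, b, score in pair_scores:
        # Touch both keys so that people without any predicted tie still appear as nodes.
        graph[a]
        graph[b]
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return the connected components as a list of sorted node lists."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node collects one component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


def density(graph):
    """Fraction of all possible undirected edges that are present."""
    n = len(graph)
    if n < 2:
        return 0.0
    # Each undirected edge is stored twice, once per endpoint.
    edges = sum(len(neighbours) for neighbours in graph.values()) / 2
    return edges / (n * (n - 1) / 2)


graph = build_social_graph(PAIR_SCORES)
components = connected_components(graph)
print(components)      # [['a', 'b', 'c'], ['d', 'e']]
print(density(graph))  # 0.3
```

With `networkx`, the same graph could be built with `nx.Graph()` and `add_edge`, and the components and density obtained from `nx.connected_components` and `nx.density`.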
Finally, throughout the process we relied on smaller scripts to plot data and perform minor data tasks.
Folder structure:
.
|-- app
| |-- static
| |-- templates
| |-- app.py
| `-- config.py
|-- data
| |-- polygons
| `-- training-data
|-- figures
|-- misc-scripts
|-- pipeline
| |-- build-indexes
| |-- build-schema
| |-- build-social-graph
| |-- graph-statistics
| |-- sql-pipeline
| `-- train-models
|-- README.md