CityMe
The CityMe project aims to create a framework for mapping, exploring and analyzing official and non-official neighborhoods, regions of interest, districts and other areas that constitute people's mental map of the city. By harvesting data from our map-based application and from social media platforms, we can better understand how citizens spatially reason about administrative and non-administrative regions in the urban landscape, such as parishes, residential areas, informal neighborhoods, historical centers and commercial areas.
With state-of-the-art spatial analysis, our framework can be applied to other cities to improve participatory urban planning, citizen engagement in public policies, and projects in smart cities.
CityMe is not restricted to characterizing different types of regions in the city; it is also committed to making cities more human through citizen-centered, transparent and bottom-up approaches across a range of applications in urban intelligence.
🎓 Objectives
- Provide a web map survey and questionnaire interface
- Perform spatial analysis to compare and characterize the regions from surveyed data and user-generated content from social media
- Evaluate survey results based on sociodemographic attributes
- Explore results based on administrative boundaries, official names, census blocks, urban morphology, landmarks and points of interest
Contents
Data platforms that support user-generated, geo-tagged content were used for this project. The relevant sources are listed below.
Data Sources
- Twitter
- Instagram
- Flickr
- Wikipedia
- Idealista
- Google Places
- AirBnB
Prerequisites
- Postgres 14.1
- Python 3.10
- GDAL 3.4.1
Database Setup
The following parameters must be configured in each ingestion file:
database = ""
user = "postgres"
password = "postgres"
host = "localhost"
port = 5432
table_name = ""
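The ingestion files presumably pass these parameters to a Postgres client; a minimal sketch, assuming the psycopg2 driver and hypothetical `database`/`table_name` values, of bundling them into connection arguments:

```python
# Example values; "cityme" and "flickr_photos" are hypothetical placeholders.
database = "cityme"
user = "postgres"
password = "postgres"
host = "localhost"
port = 5432
table_name = "flickr_photos"

def make_conn_kwargs(database, user, password, host, port):
    """Bundle the per-file parameters into keyword arguments for a
    Postgres client, e.g. psycopg2.connect(**kwargs)."""
    return {"dbname": database, "user": user, "password": password,
            "host": host, "port": port}

conn_kwargs = make_conn_kwargs(database, user, password, host, port)
# With psycopg2 installed and Postgres 14.1 running:
# import psycopg2
# conn = psycopg2.connect(**conn_kwargs)
```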
Installation
Setup Python Environment
git clone https://github.com/CityMe-project/cityme.git
cd cityme
conda create -n py10 python=3.10
conda activate py10
pip3 install -r requirements.txt
Usage
All .ipynb files show an example of how to dump data into Postgres.
Twitter
Fetching 10,000 tweets using a bounding box. More filters for the query can be found here: Twitter API.
The API requires API keys to be configured in creds.yaml. To modify the filters of the command, refer here.
python ingestion/twitter/search-tweets-python/scripts/search_tweets.py --credential-file creds.yaml --max-pages 1 --max-tweets 10000 --output-format a --results-per-call 100 --query "(bounding_box:[-9.237994149198911 38.68001599304764 -9.088636579438045 38.801883026428158] has:geo)" --start-time "2015-01-01T01:00" --end-time "2020-04-15T20:37" --tweet-fields author_id,created_at,geo,id,source,text --place-fields country,geo,name,country_code,full_name --expansions geo.place_id --filename-prefix twitter_data --no-print-stream --debug --results-per-file 5000
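The query above restricts results to a bounding box over Lisbon. A small sanity check for filtering returned points, with the coordinates copied from the command (in the lon/lat order used by the `bounding_box` operator):

```python
# Bounding box from the search query above: west, south, east, north.
WEST, SOUTH = -9.237994149198911, 38.68001599304764
EAST, NORTH = -9.088636579438045, 38.801883026428158

def in_lisbon_bbox(lon, lat):
    """True if the (lon, lat) point lies inside the query's bounding box."""
    return WEST <= lon <= EAST and SOUTH <= lat <= NORTH
```

Tweets whose geo expansion resolves to a point can be checked with this before ingestion.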
Instagram
Instagram posts can be downloaded for specific hashtags. Possible post types are top and recent.
Installation
pip install instagrapi
Basic Example
python .\hashtag_downloader.py --u username --p password --posts 1000 --keyword lisbon --type top
To continue scraping from a specific point, use the cursor stored in insta_hash_cursor.txt.
insta_duplicate_map.txt keeps track of all Instagram posts already scraped in order to avoid duplicates, since the API does not check for them, even within the same session.
python .\hashtag_downloader.py --u username --p password --posts 5000 --keyword lisbon --type recent --cursor d3a84e8b0618449a9e9c102c5cf70c01
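The duplicate bookkeeping described above can be sketched as follows, assuming insta_duplicate_map.txt simply stores one post ID per line (the actual file format may differ):

```python
from pathlib import Path

def load_seen(path):
    """Read previously scraped post IDs (one per line) into a set."""
    p = Path(path)
    return set(p.read_text().split()) if p.exists() else set()

def record_if_new(post_id, seen, path):
    """Return False if the post was already scraped; otherwise remember it
    in memory and append its ID to the duplicate map on disk."""
    if post_id in seen:
        return False
    seen.add(post_id)
    with Path(path).open("a") as f:
        f.write(post_id + "\n")
    return True
```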
Flickr
This official API enables downloading geo-tagged images. Please configure the keys in flickr.ipynb
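flickr.ipynb presumably wraps the flickr.photos.search REST method; a minimal sketch of building such a geo-restricted request (the key and bounding box values here are placeholders):

```python
from urllib.parse import urlencode

def flickr_search_url(api_key, bbox):
    """Build a flickr.photos.search request limited to geo-tagged photos.

    bbox: (min_lon, min_lat, max_lon, max_lat), as the Flickr API expects.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,               # key configured in flickr.ipynb
        "bbox": ",".join(map(str, bbox)),
        "has_geo": 1,                     # geo-tagged photos only
        "extras": "geo,date_taken",       # include coordinates in the response
        "format": "json",
        "nojsoncallback": 1,
    }
    return "https://api.flickr.com/services/rest/?" + urlencode(params)
```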
Idealista
The research-only API enables downloading geo-tagged real-estate listings. Please configure the keys in idealista.ipynb.
Wikipedia
pip install wikipedia
Age/Gender Identification
Identify the age/gender of a user from the profile picture. Use identify.ipynb and place your images in input_images. Download the pretrained model from here. Not all users may be identified.
In the second part of the script, you can use users' full names to identify whether they are Portuguese, and filter locations based on a region/city column.
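The name-based filtering could look like the following heuristic; the surname list here is a small illustrative sample, not the notebook's actual reference data:

```python
# Illustrative subset of common Portuguese surnames (placeholder data).
PORTUGUESE_SURNAMES = {"silva", "santos", "ferreira", "pereira",
                       "oliveira", "costa", "rodrigues", "martins"}

def looks_portuguese(full_name, surnames=PORTUGUESE_SURNAMES):
    """Heuristic: flag a user as Portuguese if any token of the
    full name matches a known Portuguese surname."""
    return any(tok in surnames for tok in full_name.lower().split())
```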
Citation
Brandt, J., Buckingham, K., Buntain, C., Anderson, W., Ray, S., Pool, J.R., Ferrari, N. 2020. Identifying social media user demographics and topic diversity with computational social science: a case study of a major international policy forum. Journal of Computational Social Science. https://doi.org/10.1007/s42001-019-00061-9
Google Maps POI
This requires the use of your Google Maps API key. This code is not responsible for any costs that you might incur. The script reads an input shapefile from the input_grid folder and extracts the points' lat/lon values. POIs are then extracted around each point within a 250-meter radius, and reviews of the final outputs are extracted as well. Use poi.ipynb to run the script.
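The 250-meter radius filter described above can be sketched with a plain haversine distance; this is an illustrative helper, and poi.ipynb may instead rely on the Places API's own radius parameter:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def within_radius(grid_point, poi, radius_m=250):
    """Keep a POI only if it lies within radius_m of the grid point."""
    return haversine_m(*grid_point, *poi) <= radius_m
```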
AirBnB
Airbnb data was used to collect reviews about accommodations in Lisbon, from here.