/CityMe-UGC

Primary language: Python | License: MIT

CityMe


The CityMe project aims at creating a framework for mapping, exploring and analyzing official and non-official neighborhoods, regions of interest, districts and other areas that constitute people’s mental map of the city. By harvesting data from our map-based application and social media platforms, we can better understand how citizens spatially reason about administrative and non-administrative regions in the urban landscape, such as parishes, residential areas, informal neighborhoods, historical centers and commercial areas.


Through applying state-of-the-art spatial analysis, our framework can be applied to other cities in order to improve and enhance participatory urban planning, citizen-engagement in public policies and projects in smart cities.

CityMe is not only restricted to characterizing different types of regions in the city, but it is also committed to contributing to the efforts of making cities more human through citizen-centered, transparent and bottom-up approaches in various applications in the context of urban intelligence.


 
 

🎓 Objectives

  • Make available a web map survey and questionnaire interface
  • Perform spatial analysis to compare and characterize the regions from surveyed data and user-generated content from social media
  • Evaluate survey results based on sociodemographic attributes
  • Explore results based on administrative boundaries, official names, census blocks, urban morphology, landmarks and points of interest

Contents
  1. Data Sources
  2. Prerequisites
  3. Database Setup
  4. Installation
  5. Usage
  6. Authors

Data Sources

This project uses data platforms that support user-generated, geo-tagged content. The relevant sources are listed below:
  • Twitter
  • Instagram
  • Flickr
  • Wikipedia
  • Idealista
  • Google Places

Prerequisites

  • Postgres 14.1
  • Python 3.10
  • GDAL 3.4.1

Database Setup

The following parameters must be configured in each of the ingestion files:

database = ""
user = "postgres"
password = "postgres"
host = "localhost"
port = 5432
table_name = ""
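As a rough sketch of how these parameters might be kept in one place, the snippet below groups them into a small dataclass and builds a libpq-style connection string from them. The class name DbConfig, the database name "cityme" and the table name "tweets" are hypothetical placeholders, not names taken from the ingestion files; the resulting string could be passed to a PostgreSQL client library, assuming one is installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbConfig:
    """Connection parameters mirroring the variables listed above."""

    database: str          # left empty ("") in the ingestion files; fill in your own
    table_name: str        # target table for the ingested records
    user: str = "postgres"
    password: str = "postgres"
    host: str = "localhost"
    port: int = 5432

    def dsn(self) -> str:
        """Return a libpq keyword/value connection string.

        Values containing spaces or quotes would need extra quoting,
        which this sketch does not handle.
        """
        return (
            f"dbname={self.database} user={self.user} "
            f"password={self.password} host={self.host} port={self.port}"
        )


# Hypothetical values, for illustration only.
config = DbConfig(database="cityme", table_name="tweets")
print(config.dsn())
```

Keeping the parameters in a single object avoids editing the same values separately in every ingestion file.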

Installation

Setup Python Environment

git clone https://github.com/CityMe-project/cityme.git
cd cityme
conda create -n py10 python=3.10
conda activate py10
pip3 install -r requirements.txt

Usage

Each .ipynb notebook shows an example of how to load data into Postgres.

Twitter

The command below fetches up to 10,000 geo-tagged tweets within a bounding box. Additional query filters are described in the Twitter API documentation.

The API keys must be configured in creds.yaml. To modify the filters used in the command, refer here.

python ingestion/twitter/search-tweets-python/scripts/search_tweets.py --credential-file creds.yaml --max-pages 1 --max-tweets 10000 --output-format a --results-per-call 100 --query "(bounding_box:[-9.237994149198911 38.68001599304764 -9.088636579438045 38.801883026428158] has:geo)" --start-time "2015-01-01T01:00" --end-time "2020-04-15T20:37" --tweet-fields author_id,created_at,geo,id,source,text --place-fields country,geo,name,country_code,full_name --expansions geo.place_id --filename-prefix twitter_data --no-print-stream --debug --results-per-file 5000
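The --query value above follows the Twitter bounding_box operator order: west longitude, south latitude, east longitude, north latitude. As a minimal sketch (the helper name build_bbox_query is hypothetical and not part of search-tweets-python), the query string for another area could be assembled and sanity-checked like this:

```python
def build_bbox_query(west: float, south: float, east: float, north: float) -> str:
    """Build a bounding-box query string in the same shape as the command above.

    Coordinates are decimal degrees, ordered west, south, east, north.
    """
    if not (-180 <= west < east <= 180):
        raise ValueError("expected -180 <= west < east <= 180")
    if not (-90 <= south < north <= 90):
        raise ValueError("expected -90 <= south < north <= 90")
    return f"(bounding_box:[{west} {south} {east} {north}] has:geo)"


# Rounded coordinates roughly covering the area used in the command above.
query = build_bbox_query(-9.24, 38.68, -9.09, 38.81)
print(query)
```

The printed string can be passed as the --query argument. The sketch only checks coordinate ordering; any size limits the API places on the box are not validated here.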

Instagram

Instagram posts can be downloaded for specific hashtags. Possible post types are top and recent

Installation

pip install instagrapi

Basic Example

python .\hashtag_downloader.py --u username --p password --posts 1000 --keyword lisbon --type top

To continue scraping from a specific point, use the cursor stored in insta_hash_cursor.txt. The file insta_duplicate_map.txt keeps track of all Instagram posts already scraped so that duplicates can be skipped, since the API does not check for them even within the same session.

python .\hashtag_downloader.py --u username --p password --posts 5000 --keyword lisbon --type recent --cursor d3a84e8b0618449a9e9c102c5cf70c01
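The duplicate tracking described above can be illustrated with a small sketch. It assumes the tracking file holds one post ID per line; the actual format of insta_duplicate_map.txt may differ, and the function names load_seen_ids and record_new_ids are hypothetical, not taken from hashtag_downloader.py.

```python
from pathlib import Path


def load_seen_ids(path: Path) -> set[str]:
    """Read previously scraped post IDs (assumed: one ID per line)."""
    if not path.exists():
        return set()
    lines = path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def record_new_ids(path: Path, seen: set[str], candidate_ids: list[str]) -> list[str]:
    """Return the IDs not seen before, in order, and append them to the file."""
    new_ids = []
    for post_id in candidate_ids:
        if post_id not in seen:
            seen.add(post_id)       # also guards against repeats within one batch
            new_ids.append(post_id)
    if new_ids:
        with path.open("a", encoding="utf-8") as handle:
            handle.writelines(f"{post_id}\n" for post_id in new_ids)
    return new_ids
```

Because the IDs are persisted to disk, a later run resumed with --cursor would skip posts that an earlier run already stored.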

Thanks to instagrapi


Flickr

This official API enables downloading geo-tagged images. Please configure the keys in flickr.ipynb


Idealista

The research-only API enables downloading geo-tagged real-estate listings. Please configure the keys in idealista.ipynb


Wikipedia

pip install wikipedia

Age/Gender Identification

Identify age/gender of a user using the profile picture. Use identify.ipynb and place your images in input_images. Download the pretrained model from here. Not all users may be identified.

In the second part of the script, you can use users' full names to identify whether they are Portuguese and filter locations based on a region/city column.

Citation: Brandt, J., Buckingham, K., Buntain, C., Anderson, W., Ray, S., Pool, J.R., Ferrari, N. 2020. Identifying social media user demographics and topic diversity with computational social science: a case study of a major international policy forum. Journal of Computational Social Science. https://doi.org/10.1007/s42001-019-00061-9

Inherited From This Repo

Google Maps POI

This requires your own Google Maps API key; the authors are not responsible for any costs you may incur. The script reads an input shapefile from the input_grid folder and takes the lat/lon values of its points. POIs within a 250-meter radius of each point are extracted, and the reviews of those POIs are then extracted as well.
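To make the 250-meter criterion concrete, here is a minimal sketch that filters candidate POIs by great-circle (haversine) distance from a grid point. It is illustrative only: the script in poi.ipynb presumably delegates the radius search to the Google Maps API, and the function names, the (name, lat, lon) tuple layout and the sample coordinates are assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pois_within_radius(center, pois, radius_m=250.0):
    """Keep POIs, given as (name, lat, lon), lying within radius_m of center (lat, lon)."""
    lat0, lon0 = center
    return [poi for poi in pois if haversine_m(lat0, lon0, poi[1], poi[2]) <= radius_m]


# Made-up sample points near central Lisbon.
grid_point = (38.7223, -9.1393)
candidates = [
    ("near", 38.7233, -9.1393),  # about 0.001 degrees of latitude north
    ("far", 38.7323, -9.1393),   # about 0.01 degrees of latitude north
]
print(pois_within_radius(grid_point, candidates))
```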

Use poi.ipynb to run the script

Airbnb

Airbnb data was used to collect reviews of accommodations in Lisbon, available from here

❤️ Authors

Vicente De Azevedo Tang
Contact | Github

Jaskaran Singh Puri
Contact | Github