/user-location-clustering

Project: Clustering Location Histograms. Completed as part of a KCL MSc Data Science dissertation, 2018.

Primary Language: Python

Clustering User Histograms

This project was completed as part of a dissertation submitted for the KCL MSc Data Science, 2018, and was awarded a Distinction.

User histograms are built from the check-in data collected by location-based check-in services. Foursquare is one such service, and its check-in data is readily available on Kaggle.
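As a minimal sketch of what building a user histogram involves, the snippet below counts check-ins per venue category for each user and lays the counts out as a user-by-category matrix. The record layout (`user_id`, `venue_category` pairs), the sample values, and the function names are assumptions for illustration, not the schema of the Kaggle dataset or the code used in the dissertation.

```python
from collections import Counter, defaultdict

# Hypothetical check-in records as (user_id, venue_category) pairs.
checkins = [
    ("u1", "Coffee Shop"),
    ("u1", "Coffee Shop"),
    ("u1", "Gym"),
    ("u2", "Bar"),
    ("u2", "Coffee Shop"),
]


def build_histograms(records):
    """Count check-ins per venue category for each user."""
    histograms = defaultdict(Counter)
    for user_id, category in records:
        histograms[user_id][category] += 1
    return histograms


def to_matrix(histograms):
    """Turn per-user counters into a dense user-by-category count matrix."""
    users = sorted(histograms)
    categories = sorted({c for h in histograms.values() for c in h})
    # A Counter returns 0 for categories a user never checked into.
    matrix = [[histograms[u][c] for c in categories] for u in users]
    return users, categories, matrix


histograms = build_histograms(checkins)
users, categories, matrix = to_matrix(histograms)
```

Each row of `matrix` is one user's histogram, with columns in the order given by `categories`.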

Paper Abstract

This paper aims to find and test algorithms that successfully cluster users using Foursquare’s check-in data from New York. Each user in the dataset is represented as a histogram of check-ins over the venue categories they have checked into, so that the dataset as a whole forms a matrix of check-ins. In this paper, I present a novel method for clustering user histograms according to the semantic spaces users have visited, rather than directly by the venue categories they have checked into. This has the two-fold effect of reducing the dimensionality of the histograms and possibly giving a better understanding of user clusters. This paper contributes to the understanding of clustering users from Location-Based Social Networks, which may pave the way for better applications in location recommendation or collaborative filtering in the future.
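To make the dimensionality reduction described in the abstract concrete, the sketch below collapses a venue-category histogram into a smaller histogram over semantic spaces. The category-to-space mapping, the space names, and the function name are all hypothetical; the mapping and the clustering algorithm actually used in the dissertation are not described in this README.

```python
from collections import Counter

# Hypothetical mapping from venue categories to broader semantic spaces.
CATEGORY_TO_SPACE = {
    "Coffee Shop": "Food & Drink",
    "Bar": "Food & Drink",
    "Pizza Place": "Food & Drink",
    "Gym": "Fitness",
    "Yoga Studio": "Fitness",
}


def to_semantic_histogram(category_counts, mapping, default="Other"):
    """Sum category counts per semantic space, then normalise to frequencies."""
    space_counts = Counter()
    for category, count in category_counts.items():
        # Categories missing from the mapping fall into a catch-all space.
        space_counts[mapping.get(category, default)] += count
    total = sum(space_counts.values())
    if total == 0:
        return {}
    return {space: count / total for space, count in space_counts.items()}


# One user's histogram over five venue categories (illustrative values).
user_hist = {"Coffee Shop": 3, "Bar": 1, "Pizza Place": 2, "Gym": 1, "Yoga Studio": 1}
semantic_hist = to_semantic_histogram(user_hist, CATEGORY_TO_SPACE)
```

In this toy example five category dimensions collapse into two semantic dimensions. The resulting frequency vectors could then be passed to a clustering algorithm of choice.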