Machine Learning with Tinder - Information Retrieval (CS4320)

Background

For those who aren't familiar, Tinder is a popular online dating platform typically used on mobile devices. A user is shown pictures and other information about local singles and can "like" or "pass" on each profile.

Project Overview

This project collects information from Tinder profiles, analyzes the text of the biography section, and applies clustering to identify the popular topics that users write about.

Potential Project Expansion

A possible extension of this project is to use the collected information to compose an "optimal" profile that maximizes matches.

Results

Collection Statistics

2790 total profiles collected
2060 profiles with usable biography text (about 74% of those collected)

Procedure

Data Collection

1. Obtain a Facebook token to authenticate with the Tinder API.
2. Request the recommended profiles.
3. Save the biography text to a JSON file.
4. "Like" each recommended profile.
5. Repeat steps 2 - 4 until no more likes are available.
6. Repeat steps 2 - 5 when more likes become available (every 12 hours).
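
The loop in the steps above can be sketched roughly as follows. This is only an illustration, not the project's actual code: `fetch_recommendations` and `like_profile` are hypothetical callables standing in for whatever Tinder API client is used, the `id` and `bio` keys are assumed field names, and `like_profile` is assumed to return False once no likes remain.

```python
import json
from typing import Callable, Iterable


def collect_bios(
    fetch_recommendations: Callable[[], Iterable[dict]],  # hypothetical API wrapper
    like_profile: Callable[[str], bool],  # hypothetical; False means likes ran out
    out_path: str,
) -> int:
    """Save each recommended profile's bio, then like it, until likes run out.

    Returns the number of profiles saved.
    """
    saved = []
    likes_left = True
    while likes_left:
        batch = list(fetch_recommendations())
        if not batch:
            break  # nothing left to recommend
        for profile in batch:
            # Assumed field names: "id" and "bio".
            saved.append({"id": profile["id"], "bio": profile.get("bio", "")})
            if not like_profile(profile["id"]):
                likes_left = False
                break
    # Write everything collected in this session to a JSON file.
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(saved, f, ensure_ascii=False, indent=2)
    return len(saved)
```

Authentication (step 1) and the 12-hour rerun are left out of the sketch; the rerun could be handled by any external scheduler that calls `collect_bios` again with a new output path.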

Collection Radius:

Women aged 18 and older within a 100-mile radius of Logan, UT

Data Analysis

1. Preprocess the text (remove stop words, punctuation, emojis, and numbers).
2. Extract features from the documents (each Tinder profile is a document) using TF-IDF.
3. Perform clustering with Truncated SVD (Latent Semantic Analysis).
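
A minimal sketch of the analysis steps above is shown below. The use of scikit-learn is an assumption, since the library is not named here. The stop-word list is a tiny illustrative one, the ASCII-only filter is a simplification that also drops accented letters, and the function names and the `n_topics` and `n_terms` parameters are hypothetical.

```python
import re

# Tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "i", "to", "of", "in", "my", "is", "for", "me", "you"}


def preprocess(bio: str) -> str:
    """Lowercase a bio and drop punctuation, emojis, numbers, and stop words."""
    text = bio.lower()
    # Delete apostrophes so contractions stay as one token ("don't" -> "dont").
    text = re.sub(r"['’]", "", text)
    # Replace everything that is not an ASCII letter or whitespace with a space.
    # This removes punctuation, digits, and emojis in one pass.
    text = re.sub(r"[^a-z\s]", " ", text)
    return " ".join(word for word in text.split() if word not in STOP_WORDS)


def top_terms_per_topic(bios, n_topics=5, n_terms=8):
    """Return the highest-weighted terms of each LSA component."""
    # Imported here so preprocess() can be used without scikit-learn installed.
    from sklearn.decomposition import TruncatedSVD
    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = [preprocess(bio) for bio in bios]
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform(docs)  # rows: profiles, columns: terms

    # n_topics must not exceed the vocabulary size.
    svd = TruncatedSVD(n_components=n_topics, random_state=0)
    svd.fit(tfidf)

    terms = vectorizer.get_feature_names_out()
    topics = []
    for component in svd.components_:
        top = component.argsort()[::-1][:n_terms]  # indices of the largest weights
        topics.append([terms[i] for i in top])
    return topics
```

Each returned list can be read as a rough "topic": the terms that load most heavily on one latent component. Passing the biography strings loaded from the JSON file to `top_terms_per_topic` would give one such list per component.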

malctaylor15/Tinder