calzone3: A Jupyter Notebook repository from lwgray

What's in a name?

A tutorial on my machine-learning workflow for predicting whether or not this post will be popular

Dear Reader,

When writing an article, blog, or post, we all worry about whether others will read it, like it, or even share it. I too suffer from this common anxiety; I want my writing to be popular! I often wondered why my posts received fewer clicks and likes than other people's. On the surface, the only obvious difference was the titles of the posts. With this thought in mind, I began to formulate a hypothesis: maybe the title of a post correlates with its popularity. If so, then maybe I can reverse-engineer the process and pick only popular titles.

This leads us to the purpose of this article: to describe my efforts to predict whether or not a post to the /r/datascience subreddit will be popular. I define popularity as receiving more than the average number of upvotes. I take a unique approach in making this prediction: it is based solely on the title of the redditor's post, hence this blog's title: What's in a name?
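
To make the popularity definition concrete, here is a minimal sketch of the labeling step and a title-only feature extraction. It is not the code from classify.ipynb: the sample posts, the function names `label_popular` and `title_features`, and the bag-of-words representation are all assumptions made for illustration.

```python
import re
from collections import Counter
from statistics import mean

# Hypothetical (title, upvotes) pairs, invented purely for illustration.
posts = [
    ("How I learned pandas in a week", 12),
    ("Question about p-values", 3),
    ("Free data science books", 45),
    ("Help with homework", 0),
]


def label_popular(posts):
    """Label a post 1 if its upvotes exceed the mean upvote count, else 0."""
    threshold = mean(upvotes for _, upvotes in posts)
    return [1 if upvotes > threshold else 0 for _, upvotes in posts]


def title_features(title):
    """Turn a title into lowercase word counts (a simple bag of words)."""
    return Counter(re.findall(r"[a-z0-9']+", title.lower()))


# Labels come from upvotes; features come from the title alone.
labels = label_popular(posts)
features = [title_features(title) for title, _ in posts]
```

With these made-up posts the mean is 15 upvotes, so only the third post is labeled popular. Any classifier trained on `features` and `labels` would then see nothing but the words in each title.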

Visit classify.ipynb to start the tutorial.

Check out a live version of it here.
