Project: Predicting article retweets and likes based on the title using Machine Learning
You can read the full project in this file and check the implementation here.
This is the final project of the Machine Learning Engineer Nanodegree specialization.
Abstract - Choosing a good title for an article is an important step of the writing process. The more interesting the article title seems, the higher the chance a reader will interact with the whole content. This project focuses on predicting the number of retweets and likes on Twitter for FreeCodeCamp's articles based on their titles. The problem is framed as a classification task using supervised learning. With data from FreeCodeCamp on Twitter and Medium, several machine learning methods were used to make the predictions: support vector machines (SVM), decision trees, Gaussian naive Bayes (GaussianNB), k-nearest neighbors, logistic regression, gradient boosting, and the naive Bayes classifier for multinomial models (MultinomialNB). The study shows that the MultinomialNB model performed best for retweets, reaching an accuracy of 60.6%, while logistic regression performed best for likes, reaching 55.3%.
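To make the idea in the abstract concrete, here is a minimal sketch of a multinomial naive Bayes classifier over title words, written in plain Python so it runs without extra packages. It is not the notebook's implementation: the class name `TinyMultinomialNB`, the whitespace tokenizer, the `"high"`/`"low"` labels, and the toy titles are all assumptions made up for illustration, and the real project's features, class bins, and library calls may differ.

```python
from __future__ import division  # keeps division consistent on Python 2.7 and 3

import math
from collections import Counter, defaultdict


def tokenize(title):
    """Lowercase a title and split on whitespace (a deliberately crude tokenizer)."""
    return title.lower().split()


class TinyMultinomialNB(object):
    """Hypothetical helper: multinomial naive Bayes with add-one smoothing."""

    def fit(self, titles, labels):
        self.class_counts = Counter(labels)       # number of titles per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()
        for title, label in zip(titles, labels):
            tokens = tokenize(title)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, title):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, float("-inf")
        for label in sorted(self.class_counts):
            # Start from the log prior of the class.
            score = math.log(self.class_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(title):
                if token not in self.vocab:
                    continue  # skip words never seen during training
                count = self.word_counts[label][token]
                # Add-one (Laplace) smoothed log likelihood of the word.
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy data; the labels stand in for binned retweet counts.
train_titles = [
    "how to learn javascript in one weekend",
    "learn python the easy way",
    "how to learn react fast",
    "quarterly infrastructure maintenance notes",
    "release notes for internal tooling",
    "maintenance window schedule update",
]
train_labels = ["high", "high", "high", "low", "low", "low"]

model = TinyMultinomialNB().fit(train_titles, train_labels)
prediction = model.predict("how to learn go")
print(prediction)
```

With so little data the prediction only reflects word overlap with the toy titles, so it says nothing about the accuracies reported above.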
Keywords - prediction, machine learning, social media, title, performance
Dependencies
This project requires Python 2.7 and the following Python dependencies installed:
Run
In a terminal or command window, run one of the following commands:
ipython notebook title-success-prediction.ipynb
or
jupyter notebook title-success-prediction.ipynb
This will open the Jupyter Notebook software and project file in your browser.
Note
The Capstone is a two-stage project. The first stage is the proposal, where you can receive valuable feedback on your project idea, design, and proposed solution. The proposal must be completed before you implement and submit the capstone project.