Developed software that predicts company’s LinkedIn post’s engagement through follower count, reactions, comments, and hashtag data using machine learning and Python.
Contains information about the company: Number of followers Number of connections Contains information about LinkedIn Company pages posts: Content of the post Number of hashtags used Number of reactions Number of comments
Contains information about the influencer: Number of followers Number of connections Contains information about LinkedIn Influencers’ posts: Content of the post Number of hashtags used Number of reactions Number of comments
- A metric that is a weighted average of reactions and comments was created for both datasets
- This metric is used as a proxy for the popularity of each post
- Low relative frequency, brand-specific hashtags show that there are not specific hashtags among popular posts that attract activity
- Specific brand and content are like drivers of high activity, not use of certain hashtags
Goal is to accurately measure popularity through LinkedIn Continuous, quantitative y-variable using various predictors in data set Measure popularity based on comments and reactions Assigned weights Weight of comments: ⅔ Weight of reactions: ⅓ Assigned greater weight to the comments due to extra effort it takes to write a comment Popularity does not distinguish between positive and negative interactions Only using the company data
Filtered out all rows with time less than one day Note: May create minimal bias Change all quantitative data from strings to floats Remove commas in large numbers Drop unnecessary columns Location, media_type, about, media_urls, hashtags, etc. Split data into training and testing sets
Original Model: Decision tree regressor
Predictors: followers, connections, number of hashtags
R^2: 0.2257747647065554
MSE: 203874.616921866
Second Model: Random Forest
Predictors: followers, connections, number of hashtags
MSE: 205996.68259815138
Second Model: Linear Model
Predictors: followers, connections, number of hashtags
R2: 0.012472456696375881
MSE: 103468.70309077845
- Followers, connections, number of hashtags did not seem to be strong predictors of popularity by themselves
- The content of the post may be more useful in predicting popularity
Including the content of the post in the model
We are working on pre-processing the content
Lowercasing
Removing punctuation
Removing stopwords
Tokenizing
Considering hashtags
Why is understanding how to predict the popularity of a post important?
Social media would be able to better make recommendations to companies looking to grow their platform about what type of posts users tend to interact with
Improve the user experience for companies
Help companies produce content to reach other uses and that other users are interested in