This material builds on my work as a Data Analytics Intern. During that internship, my main task was to improve the company's marketing strategy on social media platforms in order to increase its number of followers and its brand awareness.
I mainly used natural language processing in SAS Enterprise Miner. I studied 10,000 tweets from companies with a large number of posts, labeling tweets with more than 50 reposts as popular and all others as not popular. I then processed the text with typical NLP preprocessing methods and trained supervised machine learning models.
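The labeling rule and a basic text-cleaning step described above can be sketched in Python as follows. This is only a minimal illustration, not the original SAS Enterprise Miner workflow: the function names, the tokenization rules, and the two example tweets are assumptions made for this sketch. Only the threshold of 50 reposts comes from the text.

```python
import re
from typing import List

REPOST_THRESHOLD = 50  # from the text: more than 50 reposts counts as popular


def label_popular(repost_count: int) -> int:
    """Return 1 (popular) for more than 50 reposts, otherwise 0."""
    return 1 if repost_count > REPOST_THRESHOLD else 0


def tokenize(text: str) -> List[str]:
    """Lowercase the text, drop URLs, and keep words, hashtags and mentions.

    Assumed preprocessing for illustration; the original steps are not listed.
    """
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.findall(r"[#@]?\w+", text)


# Hypothetical example rows, not taken from the original data set.
tweets = [
    {"text": "Big news! Our #sale starts today https://example.com", "reposts": 120},
    {"text": "Thanks for the feedback", "reposts": 50},
]

# Each entry pairs the token list with its popularity label.
labeled = [(tokenize(t["text"]), label_popular(t["reposts"])) for t in tweets]
```

Note that a tweet with exactly 50 reposts is labeled as not popular, because the rule is "more than 50".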
My final model was an ensemble model, which had the best AUC, but my insights came mainly from the decision tree because it is easier to interpret. I found that character length, posting time, the use of hashtags, and the use of dynamic, positive words are important factors in whether a post becomes popular. I presented the report to the marketing team, and the number of followers did increase.
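To make the four kinds of factors named above concrete, here is a minimal sketch of how they could be turned into model inputs in Python. The word list, the feature names, and the example post are hypothetical. The original analysis does not specify its lexicon or its exact feature definitions.

```python
from datetime import datetime

# Hypothetical word list for illustration; the original lexicon is not given.
POSITIVE_WORDS = {"great", "love", "win", "new", "amazing"}


def extract_features(text, posted_at):
    """Turn one post into the four kinds of features named in the text."""
    # Split on whitespace and strip common trailing punctuation from each word.
    words = [w.strip(".,!?:;").lower() for w in text.split()]
    return {
        "char_length": len(text),
        "post_hour": posted_at.hour,
        "hashtag_count": sum(1 for w in words if w.startswith("#") and len(w) > 1),
        "positive_word_count": sum(1 for w in words if w in POSITIVE_WORDS),
    }


# Hypothetical post and timestamp, used only to show the output shape.
features = extract_features("We love our new #launch!", datetime(2020, 5, 1, 9, 30))
```

A dictionary like this can be fed to a decision tree after conversion to a numeric table, which keeps each split readable in terms of the original factors.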
This experience gave me a preliminary understanding of NLP. I found it interesting and useful, so I took this course in my master's program and gained a deeper understanding of text mining. After studying the course systematically, I reimplemented the analysis in Python on the same data and compared the pros and cons of the different tools for solving the same problem. In the future, when I deal with similar problems, I will make better use of their respective advantages.