joshsungasong/Project-3-Reddit-Post-Data-Analysis
This project allowed me to hone my web scraping and natural language processing skills by scraping data from Reddit. I collected several attributes of each Reddit post, including its title, number of upvotes, number of comments, and subreddit. I used natural language processing to build features from the scraped data, then fed them into a Random Forest classifier and a Logistic Regression model to identify which features are associated with a popular Reddit post. The project was framed as a real-life scenario with the FiveThirtyEight team as the client.
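The feature-building step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample posts, the helper names (`tokenize`, `build_vocabulary`, `title_to_vector`), and the definition of "popular" as an above-median comment count are all assumptions made for the example. It uses only the Python standard library to turn titles into bag-of-words count vectors and to derive a binary popularity label.

```python
import re
from collections import Counter
from statistics import median

# Hypothetical sample posts; the project's real data came from scraping Reddit.
posts = [
    {"title": "TIL honey never spoils", "subreddit": "todayilearned", "num_comments": 120},
    {"title": "What is your favorite book?", "subreddit": "AskReddit", "num_comments": 950},
    {"title": "My cat learned a new trick", "subreddit": "aww", "num_comments": 15},
    {"title": "Honey badger steals a new book", "subreddit": "aww", "num_comments": 40},
]


def tokenize(title):
    """Lowercase a title and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", title.lower())


def build_vocabulary(posts):
    """Map every distinct title token to a column index (alphabetical order)."""
    tokens = sorted({tok for post in posts for tok in tokenize(post["title"])})
    return {tok: idx for idx, tok in enumerate(tokens)}


def title_to_vector(title, vocab):
    """Bag-of-words count vector for one title, one column per vocabulary token."""
    counts = Counter(tokenize(title))
    return [counts.get(tok, 0) for tok in vocab]


vocab = build_vocabulary(posts)

# Feature matrix: one row per post, one column per vocabulary token.
X = [title_to_vector(post["title"], vocab) for post in posts]

# Assumed label: a post is "popular" (1) if its comment count exceeds the median.
threshold = median(post["num_comments"] for post in posts)
y = [1 if post["num_comments"] > threshold else 0 for post in posts]
```

A matrix `X` and label list `y` of this shape are what a classifier such as scikit-learn's `RandomForestClassifier` or `LogisticRegression` would take in its `fit(X, y)` call.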
Jupyter Notebook