reddis_data_viz

Data science work for RedditInsight

Primary language: Python

  1. Segmented data by subreddit
  2. Used NLTK to separate the words in titles by their parts of speech
  3. Developed frequency analysis of nouns by subreddit
  4. Munged the dataset for a predictive model: extracted the day of week and hour of day each post was created, and built categorical variables from the subreddit and domain features.
  5. Evaluated the predictive value of the model, then decided to focus on data visualizations.
  6. Developed clustering analysis of subreddit data for subreddits that had natural topic segmentation.
  7. Developed noun frequency analysis by subreddit
  8. Visualizations created from this work are at https://github.com/sheltowt/redditD3
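
As a rough illustration of steps 1 and 4, the sketch below segments posts by subreddit and derives day-of-week, hour-of-day, and one-hot categorical features using only the standard library. It is not the code from this repository: the sample posts, the field names (`subreddit`, `domain`, `created_utc`), and all function names are assumptions made for the example, and timestamps are assumed to be Unix seconds in UTC.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical sample posts; the field names are assumptions, not the real schema.
posts = [
    {"title": "A new plotting library", "subreddit": "python",
     "domain": "github.com", "created_utc": 1357016400},
    {"title": "Election results map", "subreddit": "dataisbeautiful",
     "domain": "imgur.com", "created_utc": 1357416000},
    {"title": "Decorators explained", "subreddit": "python",
     "domain": "imgur.com", "created_utc": 1357016400},
]


def segment_by_subreddit(posts):
    """Step 1: group posts into a dict keyed by subreddit name."""
    groups = defaultdict(list)
    for post in posts:
        groups[post["subreddit"]].append(post)
    return dict(groups)


def time_features(created_utc):
    """Step 4: day of week (Monday=0) and hour of day, both in UTC."""
    created = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    return {"day_of_week": created.weekday(), "hour_of_day": created.hour}


def one_hot(value, categories, prefix):
    """Step 4: encode a categorical value as 0/1 indicator columns."""
    return {f"{prefix}={c}": int(value == c) for c in categories}


def build_features(posts):
    """Combine time features with one-hot subreddit and domain columns."""
    subreddits = sorted({p["subreddit"] for p in posts})
    domains = sorted({p["domain"] for p in posts})
    rows = []
    for p in posts:
        row = time_features(p["created_utc"])
        row.update(one_hot(p["subreddit"], subreddits, "subreddit"))
        row.update(one_hot(p["domain"], domains, "domain"))
        rows.append(row)
    return rows


segments = segment_by_subreddit(posts)
features = build_features(posts)
```

Each row in `features` is a flat dict of numeric values, so it could be passed to a model or loaded into a data frame. The noun-frequency and clustering steps are left out here because they depend on NLTK data and modeling choices that the list above does not specify.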