/twitter-wrangle-data

Gather, assess, and clean data from the WeRateDogs Twitter account, then present analysis and visualizations.

Primary Language: Jupyter Notebook

WeRateDogs Twitter Data Analysis

Introduction

WeRateDogs is a Twitter account that rates people's dogs and adds a humorous comment about each dog. For this project, I gather data related to this account from a variety of sources and in a variety of formats, assess its quality and tidiness, and clean it. Finally, the wrangling efforts are showcased through analyses and visualizations.

Libraries

  • pandas
  • NumPy
  • Requests
  • tweepy
  • Matplotlib
  • JSON
  • csv (for CSV and TSV files)

Future Improvements

  • Create an algorithm that derives a single breed and confidence value, then remove the prediction columns
  • Extract dog gender from the text column by searching for male and female pronouns
  • Separate or remove instances where two dog stages are listed for one tweet, e.g. 'doggo,pupper' and 'doggo,floofer'
  • Compare retweets and favorites data
  • Improve the legibility of the pie chart
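
The first three improvements above could be approached roughly as follows. This is only a minimal sketch, not the project's implementation: the function names, the pronoun lists, and the `(breed, confidence, is_dog)` tuple layout for predictions are all assumptions made for illustration.

```python
import re

# Hypothetical pronoun lists (an assumption); extend as needed.
MALE_PRONOUNS = {"he", "him", "his", "himself"}
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}


def infer_gender(text):
    """Return 'male', 'female', or None based on pronoun counts in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    male = sum(w in MALE_PRONOUNS for w in words)
    female = sum(w in FEMALE_PRONOUNS for w in words)
    if male > female:
        return "male"
    if female > male:
        return "female"
    return None  # no pronouns found, or a tie


def split_stages(stage):
    """Split a combined stage string such as 'doggo,pupper' into a list."""
    if not stage:
        return []
    return [s.strip() for s in stage.split(",") if s.strip()]


def best_breed(predictions):
    """Pick the most confident prediction that is flagged as a dog.

    `predictions` is a list of (breed, confidence, is_dog) tuples,
    an assumed layout that may differ from the real prediction columns.
    """
    dogs = [(breed, conf) for breed, conf, is_dog in predictions if is_dog]
    if not dogs:
        return None, None
    return max(dogs, key=lambda pair: pair[1])
```

With pandas, functions like these could be applied per row, for example `df["text"].apply(infer_gender)`, and a column of stage lists could be expanded into one row per stage with `DataFrame.explode`. The column name `text` comes from the list above; `df` is a placeholder for the cleaned DataFrame.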