
This is a mini project for practicing with the Twitter API, data wrangling, and web scraping.

Project Introduction

This is a Udacity project designed to demonstrate data gathering and data wrangling skills.

Data sets:

  • Twitter archive of WeRateDogs tweets (provided by Udacity)
  • Additional tweet data gathered from the Twitter API via Tweepy, using a Twitter developer account. This data contains the retweet and favourite counts that are omitted from the provided WeRateDogs Twitter archive.
  • Image predictions for the pictures in these tweets: a Udacity dataset created as part of their Machine Learning nanodegree
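As a rough illustration of the second data source, the sketch below queries tweets by ID with Tweepy and keeps the retweet and favourite counts. It is not the notebook's actual code: the function names, the `credentials` dictionary keys, and the `tweet_json.txt` file name are hypothetical, and it assumes a developer account with access to the v1.1 `statuses/show` endpoint. Note that the API spells the field `favorite_count`.

```python
import json


def extract_counts(tweet_json):
    """Keep only the fields that the provided archive lacks."""
    return {
        "tweet_id": tweet_json["id"],
        "retweet_count": tweet_json["retweet_count"],
        "favorite_count": tweet_json["favorite_count"],
    }


def gather_tweets(tweet_ids, credentials, out_path="tweet_json.txt"):
    """Fetch each tweet and write one JSON object per line to out_path.

    Returns the IDs that could not be fetched (for example, deleted tweets).
    The credentials keys used below are hypothetical names.
    """
    import tweepy  # imported here so the parsing helpers work without it

    auth = tweepy.OAuthHandler(
        credentials["consumer_key"], credentials["consumer_secret"]
    )
    auth.set_access_token(
        credentials["access_token"], credentials["access_secret"]
    )
    api = tweepy.API(auth, wait_on_rate_limit=True)

    failed = []
    with open(out_path, "w") as f:
        for tweet_id in tweet_ids:
            try:
                status = api.get_status(tweet_id, tweet_mode="extended")
            except Exception:
                # Tweepy's error class is named differently across major
                # versions, so this sketch catches broadly.
                failed.append(tweet_id)
                continue
            f.write(json.dumps(status._json) + "\n")
    return failed


def load_counts(path="tweet_json.txt"):
    """Read the line-delimited JSON file back into a list of count records."""
    with open(path) as f:
        return [extract_counts(json.loads(line)) for line in f if line.strip()]
```

Saving the raw JSON first and parsing it in a separate step means the API only has to be queried once, which helps when rate limits make the gathering step slow.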

The project will involve cleaning those datasets to meet tidy data standards. [https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html]

The cleaned datasets will then be merged into a single master dataset, on which the analysis will be performed.
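The merge into a master dataset could look like the following pandas sketch. It assumes all three sources share a `tweet_id` key; the function name, the other column names, and the sample values are made up for illustration and are not taken from the notebook.

```python
import pandas as pd


def build_master(archive, counts, predictions):
    """Join the three cleaned tables on tweet_id.

    Inner joins keep only the tweets that appear in all three sources.
    """
    return (
        archive
        .merge(counts, on="tweet_id", how="inner")
        .merge(predictions, on="tweet_id", how="inner")
    )


# Tiny stand-in frames with illustrative columns and values.
archive = pd.DataFrame({"tweet_id": [1, 2, 3], "rating_numerator": [12, 13, 10]})
counts = pd.DataFrame(
    {"tweet_id": [1, 2], "retweet_count": [50, 7], "favorite_count": [200, 30]}
)
predictions = pd.DataFrame({"tweet_id": [1, 3], "p1": ["pug", "beagle"]})

master = build_master(archive, counts, predictions)
print(master)
```

Only tweet 1 is present in all three frames, so the result has a single row. Using `how="left"` instead would keep every archived tweet and leave missing counts or predictions as `NaN`.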

Project Steps:

  • Step 1: Gathering data
  • Step 2: Assessing data
  • Step 3: Cleaning data
  • Step 4: Storing data
  • Step 5: Analyzing and visualizing data
  • Step 6: Reporting


Project Files:

  • Main notebook with the code implementing the project steps: wrangle_act.ipynb
  • Documentation of the data wrangling steps (gather, assess, and clean): wrangle_report.pdf
  • Documentation of the analysis and insights into the final data: act_report.pdf