Project Introduction
This is a Udacity project that demonstrates data gathering and data wrangling skills.
Data sets:
- Twitter archive of WeRateDogs tweets (provided by Udacity)
- Original Twitter data, gathered through the Twitter API with Tweepy using a Twitter developer account. This data contains the retweet and favourite counts that are omitted from the provided WeRateDogs Twitter archive.
- Image predictions for the pictures in these tweets: a Udacity dataset created as part of their Machine Learning nanodegree
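As a rough illustration of the second dataset, the sketch below pulls the retweet and favourite counts out of tweet JSON. It assumes each tweet returned by the API has already been saved as one JSON object per line; the function name `extract_counts` and the sample values are hypothetical, and only the standard library is used so the snippet runs without a developer account.

```python
import json


def extract_counts(json_lines):
    """Return one record per tweet with its retweet and favorite counts."""
    records = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        records.append({
            "tweet_id": tweet["id"],
            "retweet_count": tweet["retweet_count"],
            "favorite_count": tweet["favorite_count"],
        })
    return records


# Made-up sample lines standing in for the saved API responses (assumption).
sample = [
    '{"id": 1, "retweet_count": 10, "favorite_count": 25}',
    '',
    '{"id": 2, "retweet_count": 3, "favorite_count": 7}',
]
counts = extract_counts(sample)
```

In the notebook the same records would more likely be loaded into a pandas DataFrame; the plain-Python form here only shows which fields are taken from each tweet.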
The project involves cleaning these datasets to meet tidy data standards [https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html].
The cleaned datasets are then merged into a master dataset, on which the analysis is performed.
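The merge into a master dataset can be pictured as an inner join on the tweet identifier. The following is a minimal sketch under stated assumptions, not the notebook's actual code: the helper `merge_on_tweet_id`, the column names, and the sample rows are all hypothetical, and a pandas merge would be the more likely choice in practice.

```python
def merge_on_tweet_id(*tables):
    """Inner-join lists of dict rows on their 'tweet_id' key."""
    merged = {row["tweet_id"]: dict(row) for row in tables[0]}
    for table in tables[1:]:
        lookup = {row["tweet_id"]: row for row in table}
        # Keep only tweets present in every table seen so far.
        merged = {
            tweet_id: {**row, **lookup[tweet_id]}
            for tweet_id, row in merged.items()
            if tweet_id in lookup
        }
    return list(merged.values())


# Made-up rows standing in for the three cleaned datasets (assumption).
archive = [
    {"tweet_id": 1, "rating_numerator": 12},
    {"tweet_id": 2, "rating_numerator": 10},
]
api_counts = [{"tweet_id": 1, "retweet_count": 10, "favorite_count": 25}]
predictions = [
    {"tweet_id": 1, "p1": "golden_retriever"},
    {"tweet_id": 3, "p1": "pug"},
]

master = merge_on_tweet_id(archive, api_counts, predictions)
```

An inner join drops tweets that are missing from any one source, which keeps every row of the master dataset complete at the cost of losing some tweets.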
Project Steps:
- Step 1: Gathering data
- Step 2: Assessing data
- Step 3: Cleaning data
- Step 4: Storing data
- Step 5: Analyzing and visualizing data
- Step 6: Reporting
Files:
- Main notebook with code implementing the project steps: wrangle_act.ipynb
- Documentation of the data wrangling steps (gather, assess, and clean): wrangle_report.pdf
- Documentation of the analysis and insights into the final data: act_report.pdf