JSC370_Project_Youtube_data_analysis

YouTube, the world’s third most popular online destination, has transformed from a video-sharing site into a job opportunity for content creators in both new and mainstream media. (cite: https://www.elon.edu/u/academics/communications/journal/wp-content/uploads/sites/153/2017/06/06_Margaret_Holland.pdf) Individuals who upload videos on Youtube, also known as YouTubers, could turn on monetization features. One of the major ways YouTubers earn money is through the number of ad views (https://support.google.com/youtube/answer/72857?hl=en). Since ad views depend on each video’s views, we would like to analyze what factors could result in a high view and how people’s preferences have changed in recent years.

Instruction

Download CA_youtube_trending_data.csv from https://www.kaggle.com/rsrishav/youtube-trending-video-dataset.
Download CAvideos.csv from https://www.kaggle.com/datasnaek/youtube-new?select=CAvideos.csv
Download CA_category_id.json from https://www.kaggle.com/datasnaek/youtube-new (I have included in the repository)
Run insert_words.py and it will produce CAvideos2.csv
Run mid_Part1_final.Rmd inside R studio to produce the same result (an html file).

Note that I also included my API call result All_ca_trend_vid_content.csv, so you do not have to obtain a key and run the API for a long time to obtain the additional data.

Hantang-Li/JSC370_Project_Youtube_data_analysis

JSC370_Project_Youtube_data_analysis

Instruction