
Scripts for enriching Google Takeout YouTube watch-history.html data.


sadge-pub

Python script for getting more data from Google Takeout's YouTube watch-history.html

A one-stop Python script that takes you from Google Takeout's bare-bones watch-history export to a dataset populated with video metadata, keywords, and channel insights.
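To give a feel for the starting point, here is a minimal sketch of pulling video IDs and titles out of watch-history.html using only the standard library. It assumes each history entry is an `<a>` tag whose `href` is a YouTube `/watch?v=...` URL. `WatchLinkParser` and `extract_watch_entries` are hypothetical names for illustration, not necessarily how sadge_scraper.py does it.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class WatchLinkParser(HTMLParser):
    """Collects (video_id, title) pairs from anchors pointing at YouTube watch URLs."""

    def __init__(self):
        super().__init__()
        self.entries = []          # finished (video_id, title) pairs
        self._current_id = None    # video ID of the anchor being read, if any
        self._title_parts = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parsed = urlparse(href)
        # Assumption: watch entries look like https://www.youtube.com/watch?v=<id>
        if parsed.netloc.endswith("youtube.com") and parsed.path == "/watch":
            ids = parse_qs(parsed.query).get("v")
            if ids:
                self._current_id = ids[0]
                self._title_parts = []

    def handle_data(self, data):
        # Only keep text that sits inside a watch link.
        if self._current_id is not None:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_id is not None:
            title = "".join(self._title_parts).strip()
            self.entries.append((self._current_id, title))
            self._current_id = None


def extract_watch_entries(html_text):
    """Return a list of (video_id, title) tuples found in the given HTML."""
    parser = WatchLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.entries
```

Channel links and other anchors are skipped, so the result only contains watched videos. The video IDs are the natural key for looking up extra metadata afterwards.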


Instructions:

  1. Go to [Google Takeout](https://takeout.google.com/settings/takeout)

  2. Scroll to the bottom and check "YouTube and YouTube Music"

  3. Click "All data included" and select only history

  4. Click Next Step, then Create Export

  5. Download the .zip from the link sent to your Gmail

  6. Run the following:

git clone https://github.com/tyler-keller/sadge-pub.git

  7. Unpack the .zip and move the watch-history.html file to the ./sadge-pub/data/ directory

  8. Run the following:

cd sadge-pub
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python3 sadge_scraper.py
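If you'd rather not unpack the archive by hand, the unpack-and-move step above can be scripted. This is a rough sketch, not part of the repo: `extract_watch_history` is a hypothetical helper, and it searches the archive by file name because the exact folder layout inside the Takeout .zip is an assumption.

```python
import shutil
import zipfile
from pathlib import Path


def extract_watch_history(zip_path, data_dir):
    """Copy watch-history.html out of a Takeout archive into data_dir; return its path."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / "watch-history.html"

    with zipfile.ZipFile(zip_path) as archive:
        # Match on the file name only, since the folder layout is an assumption.
        matches = [
            name for name in archive.namelist()
            if name.rsplit("/", 1)[-1] == "watch-history.html"
        ]
        if not matches:
            raise FileNotFoundError(f"watch-history.html not found in {zip_path}")
        # Stream the first match straight into the data directory.
        with archive.open(matches[0]) as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return target
```

For example, `extract_watch_history("takeout.zip", "sadge-pub/data")` would leave the file at sadge-pub/data/watch-history.html (the archive name here is a placeholder for whatever Google sends you).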

Note: Time to completion depends on the size of your watch history and on Google's rate limiting.
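For context on why rate limiting stretches the run time, here is a generic sketch of retrying with exponential backoff. It is only an illustration of the idea: `RateLimitError`, `call_with_backoff`, and the `fetch` callable are hypothetical, and sadge_scraper.py may handle rate limits differently.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised when a request is rejected for being too frequent."""


def call_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay * 2^attempt seconds plus up to 1s of random jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
```

With the defaults, the waits grow roughly as 1s, 2s, 4s, 8s, so a history with thousands of videos can take a while if requests keep getting throttled.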