x4nth055/pythoncode-tutorials

extract_video_info.py is broken

mattpopovich opened this issue · 3 comments

Specifically, I believe that yt-formatted-string is no longer populated, breaking everything from "likes" and below.

I expect to see something similar to this, as noted here:

$ python extract_video_info.py https://www.youtube.com/watch?v=jNQXAC9IVRw
Title: Me at the zoo
Views: 172639597
Published at: 2005-04-23
Video Duration: 0:18
Video tags: me at the zoo, jawed karim, first youtube video
Likes: 8188077
Dislikes: 191986

Description: The first video on YouTube. While you wait for Part 2, listen to this great song: https://www.youtube.com/watch?v=zj82_v2R6ts


Channel Name: jawed
Channel URL: https://www.youtube.com/channel/UC4QobU6STFB0P71PMvOGN5A
Channel Subscribers: 1.98M subscribers

But I instead get this :(

mattpopovich@MBP $ python3 extract_video_info.py 'https://www.youtube.com/watch?v=jNQXAC9IVRw'
Traceback (most recent call last):
  File "/Users/mattpopovich/Documents/extract_video_info.py", line 58, in <module>
    data = get_video_info(url)
  File "/Users/mattpopovich/Documents/extract_video_info.py", line 32, in get_video_info
    result["likes"] = ''.join([ c for c in text_yt_formatted_strings[0].attrs.get("aria-label") if c.isdigit() ])
IndexError: list index out of range

I'm looking into it. Just wanted to document it for the time being.

Looks like the ytInitialData variable has a "defaultText":{"accessibility":{"accessibilityData":{"label":"##### likes"}}} entry, which might be promising.
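A rough sketch of how that label could be located without hard-coding where it sits in the structure. Everything here is an assumption for illustration: `extract_initial_data` and `find_like_labels` are hypothetical helper names, `sample_html` is a made-up, heavily shortened page, and the real page source may format the `var ytInitialData = ` assignment differently. It uses `json.JSONDecoder().raw_decode` so that parsing stops at the end of the JSON object rather than at the first `};`.

```python
import json

# Assumed spelling of the assignment in the page source.
MARKER = "var ytInitialData = "


def extract_initial_data(html):
    """Return the parsed ytInitialData object, or None if the marker is absent."""
    start = html.find(MARKER)
    if start == -1:
        return None
    # raw_decode parses one JSON value starting at the given index and ignores
    # whatever follows it, so a "};" inside a string cannot cut the match short.
    obj, _end = json.JSONDecoder().raw_decode(html, start + len(MARKER))
    return obj


def find_like_labels(node):
    """Recursively collect every "label" string that ends in " likes"."""
    found = []
    if isinstance(node, dict):
        label = node.get("label")
        if isinstance(label, str) and label.endswith(" likes"):
            found.append(label)
        for value in node.values():
            found.extend(find_like_labels(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_like_labels(item))
    return found


# Hypothetical, heavily shortened page source, for illustration only.
sample_html = (
    '<script>var ytInitialData = {"note": "a }; inside a string", '
    '"defaultText": {"accessibility": {"accessibilityData": '
    '{"label": "12,345 likes"}}}};</script>'
)

data = extract_initial_data(sample_html)
labels = find_like_labels(data)
print(labels)
```

In the script, `sample_html` would be replaced by the fetched page source. A recursive search is slower than a direct lookup, but it should keep working if the label moves elsewhere in the tree.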

A temporary fix/hack for likes:

import json
import re

# ytInitialData is embedded in the page source as a JavaScript variable
data = re.search(r"var ytInitialData = ({.*?});", soup.prettify()).group(1)
data_json = json.loads(data)
likes_label = data_json['contents']['twoColumnWatchNextResults']['results']['results']['contents'][0]['videoPrimaryInfoRenderer']['videoActions']['menuRenderer']['topLevelButtons'][0]['toggleButtonRenderer']['defaultText']['accessibility']['accessibilityData']['label']
# the label looks like "##### likes"; strip commas in case the count has thousands separators
result["likes"] = int(likes_label.split(' ')[0].replace(',', ''))
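Since that chain of keys will raise as soon as YouTube moves anything, here is a hedged sketch of a more defensive lookup. `dig`, `parse_like_count`, and `nest` are hypothetical helpers (not part of the script), and `sample` is a made-up miniature of `data_json` built only to exercise them. The path itself is copied from the snippet above.

```python
def dig(node, path):
    """Follow dict keys / list indexes in order; return None if any step is missing."""
    for step in path:
        try:
            node = node[step]
        except (KeyError, IndexError, TypeError):
            return None
    return node


def parse_like_count(label):
    """Turn a label such as '12,345 likes' into an int, or None if it has no digits."""
    digits = "".join(c for c in (label or "") if c.isdigit())
    return int(digits) if digits else None


def nest(path, leaf):
    """Build a nested dict/list structure that contains leaf at the given path."""
    for step in reversed(path):
        leaf = [leaf] if isinstance(step, int) else {step: leaf}
    return leaf


# Path taken from the snippet above; YouTube may change it at any time.
LIKES_PATH = [
    "contents", "twoColumnWatchNextResults", "results", "results", "contents", 0,
    "videoPrimaryInfoRenderer", "videoActions", "menuRenderer", "topLevelButtons", 0,
    "toggleButtonRenderer", "defaultText", "accessibility", "accessibilityData", "label",
]

# Hypothetical stand-in for data_json, for illustration only.
sample = nest(LIKES_PATH, "12,345 likes")

likes = parse_like_count(dig(sample, LIKES_PATH))
missing = parse_like_count(dig({}, LIKES_PATH))
print(likes, missing)
```

With this, a layout change would leave `result["likes"]` as `None` instead of ending the script with a `KeyError` or `IndexError`.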

Not sure if there's anything we can do to get dislikes anymore... Hopefully there's an easier way to get likes.

I can make a PR if you'd like, just let me know.

Thanks, @mattpopovich. I mentioned your contribution in the tutorial! Check it out: https://www.thepythoncode.com/article/get-youtube-data-python

Appreciate the shoutout, @x4nth055!