404 when trying to iterate through streams provided by YouTube/Video object
Closed this issue · 3 comments
jslay88 commented
Some videos have streams that return a 404 when certain attributes are accessed on the stream (filesize, in the tracebacks below).
For instance, this video: https://www.youtube.com/watch?v=3ejPqYn1gOY
python.exe C:/Scripts/YouTubeDownload/qt_gui.py
get video id
3ejPqYn1gOY
Worker 0: Loading video
Worker 0: 33532: thread starting...
loading streams
Worker 0: Loading stream 0
Worker 0: Loading streams for video 0
Worker 0: Loaded video: checkra1n on Raspberry PI
Worker 0: Loaded stream 0
Worker 0: Loading stream 1
Worker 0: Loading stream 2
Worker 0: Loaded stream 1
Worker 0: Loading stream 3
Worker 0: Loaded stream 2
Traceback (most recent call last):
  File "C:\Scripts\YouTubeDownload\qt_assets\tabs\downloader.py", line 139, in load_streams
    f'Res: {stream.resolution}, FPS: {stream.fps}, '
  File "C:\Scripts\YouTubeDownload\venv\lib\site-packages\pytube\streams.py", line 143, in filesize
    headers = request.head(self.url)
  File "C:\Scripts\YouTubeDownload\venv\lib\site-packages\pytube\request.py", line 57, in head
    response_headers = _execute_request(url, method="HEAD").info()
  File "C:\Scripts\YouTubeDownload\venv\lib\site-packages\pytube\request.py", line 19, in _execute_request
    return urlopen(request) # nosec
  File "C:\Python37\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python37\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Python37\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python37\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "C:\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Python37\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
<class 'urllib.error.HTTPError'> HTTP Error 404: Not Found <traceback object at 0x0000026452469748>
>>> from pytube3 import YouTube
>>> yt = YouTube('https://www.youtube.com/watch?v=3ejPqYn1gOY&feature=youtu.be')
>>> for stream in yt.streams.all():
...     print(
...         f'Codec: {stream.audio_codec}, '
...         f'ABR: {stream.abr}, '
...         f'File Type: {stream.mime_type.split("/")[1]}, '
...         f'Size: {stream.filesize // 1024} KB'
...     )
...
Codec: mp4a.40.2, ABR: 96kbps, File Type: mp4, Size: 606 KB
Codec: mp4a.40.2, ABR: 192kbps, File Type: mp4, Size: 1919 KB
Codec: None, ABR: None, File Type: mp4, Size: 1714 KB
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "C:\Scripts\YouTubeDownload\venv\lib\site-packages\pytube\streams.py", line 143, in filesize
    headers = request.head(self.url)
  File "C:\Scripts\YouTubeDownload\venv\lib\site-packages\pytube\request.py", line 57, in head
    response_headers = _execute_request(url, method="HEAD").info()
  File "C:\Scripts\YouTubeDownload\venv\lib\site-packages\pytube\request.py", line 19, in _execute_request
    return urlopen(request) # nosec
  File "C:\Python37\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python37\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Python37\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python37\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "C:\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Python37\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
hbmartin commented
@jslay88 I'll take a look. It's happening because the filesize property is retrieved with a network call (a HEAD request on the stream URL, per the traceback), and that request returns 404 for some streams.
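A minimal sketch of guarding against this on the caller's side, assuming only that reading filesize may raise urllib.error.HTTPError as in the traceback above. The helper name safe_filesize is hypothetical and not part of pytube3:

```python
import urllib.error


def safe_filesize(stream):
    """Return stream.filesize, or None if the underlying HTTP request fails.

    `stream` is assumed to be any object with a `filesize` attribute whose
    lookup may raise urllib.error.HTTPError (for example, a 404).
    """
    try:
        return stream.filesize
    except urllib.error.HTTPError:
        # The size could not be fetched; let the caller decide what to show.
        return None


def describe_sizes(streams):
    """Build one display line per stream without aborting on a failed lookup."""
    lines = []
    for stream in streams:
        size = safe_filesize(stream)
        label = "unavailable" if size is None else f"{size // 1024} KB"
        lines.append(f"Size: {label}")
    return lines
```

With a guard like this, a loop over yt.streams would print "Size: unavailable" for the affected streams instead of stopping at the first 404.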
hbmartin commented
- Looks like @movraxrsp and @swiftyy-mage did some digging into this issue a while back: pytube#543
- I took what they found and pushed a PR here: https://github.com/hbmartin/pytube3/pull/48
- The above PR lets you filter these streams out (see example in link)
- Also, for listing the sizes of many streams, please use the new filesize_approx property, which is very accurate and avoids the HTTP call overhead of filesize
- I will continue investigating the possibility of decrypting these URLs to a non-error
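As a rough illustration of why an approximate size needs no HTTP call: a size can be estimated from a stream's average bitrate and the video's duration, both of which are plain numbers. This is only a sketch under that assumption. approx_size_bytes is a hypothetical helper, and the actual filesize_approx implementation in the PR may differ:

```python
def approx_size_bytes(bitrate_bps: int, duration_s: int) -> int:
    """Estimate a size in bytes from average bitrate (bits/s) and duration (s)."""
    bits_in_byte = 8
    return int(bitrate_bps * duration_s / bits_in_byte)


# Example: a 96 kbps audio stream lasting 60 seconds (illustrative numbers only).
size = approx_size_bytes(96_000, 60)
print(f"Size: {size // 1024} KB")
```

Because the estimate uses only values already in hand, it cannot hit the 404 that the HEAD request behind filesize can.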
jslay88 commented
Sweet. Will take a look probably tomorrow. Pretty slammed with work at the moment. Thanks for your efforts.