JuanBindez/pytubefix

'urllib.error.HTTPError: HTTP Error 403: Forbidden' if download takes more than 30 seconds

Closed this issue · 7 comments

This might be related to the other similar issue (#12) JuanBindez submitted about 3 weeks ago.

I have tested this using both the CLI and the Python library. It seems that no matter which YouTube video stream you choose, if the download takes more than 30 seconds, it raises an 'HTTP Error 403' Exception, causing the download to fail.
I have obviously taken into account the other issue, so I made sure to use a short video and to include use_oauth=True and allow_oauth_cache=True in the Python script to test this out.

I have no idea how the download of streams is handled by pytube, and if internet speed plays a role in it, so the next commands might not work for everyone.
In my case, these commands will raise the HTTP Exception exactly 30 seconds after being executed:

CLI:

pytubefix https://www.youtube.com/watch?v=E8gmARGvPlI --itag 313

Python script:

from pytubefix import YouTube
from pytubefix.cli import on_progress

yt = YouTube(
	'https://www.youtube.com/watch?v=E8gmARGvPlI',
	use_oauth=True,
	allow_oauth_cache=True,
	on_progress_callback=on_progress,
)

# <Stream: itag="313" mime_type="video/webm" res="2160p" fps="25fps" vcodec="vp9" progressive="False" type="video">

uhd_webm_video_only = yt.streams.get_by_itag(313)
uhd_webm_video_only.download()

Run pytubefix https://www.youtube.com/watch?v=E8gmARGvPlI --list; the streams with progressive="False" will not work.

I tried to provide a complete rundown of every stream for that video. Here it is:

Progressive:

<Stream 17 mime="video/3gpp" res="144p" fps="6fps"  vcodec="mp4v.20.3"   acodec="mp4a.40.2"> # Works, took less than 30s
<Stream 18 mime="video/mp4"  res="360p" fps="25fps" vcodec="avc1.42001E" acodec="mp4a.40.2"> # Works, took less than 30s
<Stream 22 mime="video/mp4"  res="720p" fps="25fps" vcodec="avc1.64001F" acodec="mp4a.40.2"> # Works, took less than 30s

Not Progressive:

Video:

Mime 'video/webm' @ 25fps, vcodec 'vp9'

<Stream 313 res="2160p">	# Doesn't work, HTTP 403 after 30s
<Stream 271 res="1440p">	# Doesn't work, HTTP 403 after 30s
<Stream 248 res="1080p">	# Works, took less than 30s
<Stream 247 res="720p">		# Works, took less than 30s
<Stream 244 res="480p">		# Works, took less than 30s
<Stream 243 res="360p">		# Works, took less than 30s
<Stream 242 res="240p">		# Works, took less than 30s
<Stream 278 res="144p">		# Works, took less than 30s

Mime 'video/webm' @ 13fps, vcodec 'vp9'

<Stream 598 res="None">	# Works, took less than 30s

Mime 'video/mp4' @ 25fps, vcodec 'avc1.******'

<Stream 137 res="1080p">	# Works, took less than 30s
<Stream 136 res="720p">		# Works, took less than 30s
<Stream 135 res="480p">		# Works, took less than 30s
<Stream 134 res="360p">		# Works, took less than 30s
<Stream 133 res="240p">		# Works, took less than 30s
<Stream 160 res="144p">		# Works, took less than 30s

Mime 'video/mp4' @ 13fps, vcodec 'avc1.4d400b'

<Stream 597 res="None">		# Works, took less than 30s

Mime 'video/mp4' @ 25fps, vcodec 'av01.0.**M.08'

<Stream 401 res="2160p">	# Doesn't work, HTTP 403 after 30s
<Stream 400 res="1440p">	# Works, took less than 30s
<Stream 399 res="1080p">	# Works, took less than 30s
<Stream 398 res="720p">		# Works, took less than 30s
<Stream 397 res="480p">		# Works, took less than 30s
<Stream 396 res="360p">		# Works, took less than 30s
<Stream 395 res="240p">		# Works, took less than 30s
<Stream 394 res="144p">		# Works, took less than 30s

Audio:

Mime 'audio/mp4', acodec 'mp4a.40.*'

<Stream 599 abr="None">		# Works, took less than 30s
<Stream 139 abr="48kbps">	# Works, took less than 30s
<Stream 140 abr="128kbps">	# Works, took less than 30s

Mime 'audio/webm', acodec 'opus'

<Stream 249 abr="50kbps">	# Works, took less than 30s
<Stream 250 abr="70kbps">	# Works, took less than 30s
<Stream 251 abr="160kbps">	# Works, took less than 30s
<Stream 600 abr="None">		# Works, took less than 30s

In the end, only the streams that would have taken me more than 30 seconds to download (313, 271, 401) failed.
Everything else worked.

I have been having the same issues trying to download long lectures (whether authenticated or unauthenticated).

Thought it was worth noting that grabbing the .url property of the stream and just passing it into curl -O "<the url>" works just fine. Of course it saves it as videoplayback, so you have to rename the file once it downloads.

Similarly, dropping the URL into the browser and just saving the file there seems to work fine (without being logged in, etc). This leads me to believe that the problem is something specific to the package's download mechanism causing the disconnects.
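
For anyone who wants the curl workaround without leaving Python, here is a minimal sketch of the same idea: fetch the stream's .url with one plain GET request and write the body to disk, instead of going through the package's own download code. The helper name download_single_request and the 60-second timeout are my own choices, not part of pytubefix, and I have not checked how YouTube treats Python's default User-Agent, so treat it as a starting point only.

```python
import shutil
import urllib.request


def download_single_request(url, filename, timeout=60):
    """Fetch `url` with a single GET request and stream the body to `filename`.

    Hypothetical helper (not a pytubefix API). It mirrors `curl -o filename url`:
    one request, no chunked re-requests against the same URL.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(filename, "wb") as out:
        # copyfileobj reads in small blocks, so the whole video never sits in memory.
        shutil.copyfileobj(response, out)
    return filename


# Intended use (assumes `yt` was created as in the script at the top of the issue):
#   stream = yt.streams.get_by_itag(313)
#   download_single_request(stream.url, "video.webm")
```

Passing the filename explicitly also avoids the rename step, since nothing gets saved as videoplayback.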

My best guess is that the URL provided by Google of a video has a time window of ~30 seconds in which it can accept GET requests.
After that time window, it just closes the URL and returns a 403 response for all subsequent requests.

The library downloads a video by streaming it; that is, it sends GET requests for chunks of 9 MB at a time. If one of those requests happens after that time window, the URL has already been closed by Google and the download can't continue.

I am currently trying to find a workaround, if I find anything useful I will post it here.
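
To make the chunking described above concrete, here is a rough sketch of how a file gets split into 9 MB requests. This is only an illustration of my guess, not the library's actual code: byte_ranges and range_headers are hypothetical names, and I am writing the ranges as standard HTTP Range header values even though I have not confirmed how pytubefix encodes them.

```python
def byte_ranges(total_size, chunk_size=9 * 1024 * 1024):
    """Yield inclusive (start, end) byte offsets that cover `total_size` bytes."""
    start = 0
    while start < total_size:
        # The last chunk is usually shorter than chunk_size.
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def range_headers(total_size, chunk_size=9 * 1024 * 1024):
    """Build one Range header per chunk (hypothetical helper, for illustration)."""
    return [
        {"Range": f"bytes={start}-{end}"}
        for start, end in byte_ranges(total_size, chunk_size)
    ]
```

Each entry corresponds to a separate GET against the same URL, so a large stream needs many requests. If the connection cannot finish all of them within the ~30 second window, the later ones would be the ones getting the 403.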

I would like you to test this version and let me know if it worked -> pip install pytubefix==1.10rc1

I tried it and it works now, no error even if the download takes more than 30 seconds.
May I ask you what the problem was?

It was fixed by a contributor -> "Fixed 403 Forbidden error when using ANDROID client" (#21). I will release a version with this fix soon.