Use a format fallback or allow manually setting the video format
alexislours opened this issue · 4 comments
The current format used to download videos is notoriously error-prone (see yt-dlp/yt-dlp#3372).
One solution would be for yark to fall back to webm in such cases, to ask the user to manually pick a format from the yt-dlp -F output, or to have an option to pass the following string as a CLI argument: https://github.com/Owez/yark/blob/c48e37ae405052bb443b04098d088ccc8e071b4e/yark/channel.py#LL209C10-L209C10
The latter would also allow saving channels at a higher resolution, since YouTube only serves mp4 with audio up to 720p.
The main drawback is that the video file will generally be larger in such cases.
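As a rough illustration of the fallback idea, here is a minimal sketch that tries a list of format selectors in order and moves on when a download fails. The function name `download_with_fallback`, the `download` callable, and the example selector list are all hypothetical and not taken from yark's code. In practice `download` would presumably wrap a `yt_dlp.YoutubeDL` call with the `format` option set to the given selector.

```python
from typing import Callable, Iterable


def download_with_fallback(
    video_id: str,
    formats: Iterable[str],
    download: Callable[[str, str], None],
) -> str:
    """Try each format selector in order and return the first one that works.

    `download(video_id, fmt)` is a hypothetical callable that raises on failure.
    A real implementation would likely catch yt-dlp's DownloadError rather than
    the broad Exception used here.
    """
    last_error = None
    for fmt in formats:
        try:
            download(video_id, fmt)
            return fmt  # this selector downloaded successfully
        except Exception as exc:
            # Remember the failure and fall through to the next selector
            last_error = exc
    raise RuntimeError(f"all formats failed for {video_id}") from last_error
```

A user-supplied CLI format could simply be placed at the front of `formats`, e.g. `[user_format, "best[ext=mp4]", "best[ext=webm]", "best"]`, so the override is tried first and the defaults still act as a safety net.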
Example of a video affected by it: https://www.youtube.com/watch?v=YbYpbXMUsYM
yt-dlp error:
yt-dlp -f "best/[ext=mp4]/hasvid" "https://www.youtube.com/watch?v=YbYpbXMUsYM" -o YbYpbXMUsYM.mp4
[youtube] Extracting URL: https://www.youtube.com/watch?v=YbYpbXMUsYM
[youtube] YbYpbXMUsYM: Downloading webpage
[youtube] YbYpbXMUsYM: Downloading android player API JSON
[info] YbYpbXMUsYM: Downloading 1 format(s): 22
[download] Resuming download at byte 1713408
ERROR: Did not get any data blocks
yark error:
yark refresh munecat
Loading munecat channel..
Downloading metadata..
Parsing video metadata..
Parsing livestream metadata..
Parsing shorts metadata..
Cleaning out previous temporary files..
Downloading 19 new videos..
• Downloading YbYpbXMUsYM, at 0.2%..
• Unknown error whilst downloading videos, details below:
[download] Got error: Downloaded 1713408 bytes, expected 780994664 bytes, retrying in a few seconds..
• Fault with YouTube's servers, retrying in a few seconds..
• Unknown error whilst downloading videos, details below:
ERROR: Did not get any data blocks, retrying in a few seconds..
• Unknown error whilst downloading videos, details below:
ERROR: Did not get any data blocks, retrying in a few seconds..
• Unknown error whilst downloading videos, details below:
ERROR: Did not get any data blocks
• Sorry, failed to download {name}
Please file a bug report if you think this is a problem with Yark!
Good to know, the custom format argument is a good idea. I'd like to prioritise the best format or a high-quality one by default (if possible) and if not, any format that works. The size of the video file is an alright drawback as long as users have that argument option to use lower-quality videos if they need to.
Prioritizing the best format would just be a matter of not setting the format when starting the download with yt-dlp. But given that the project uses yt-dlp without FFMPEG, it will grab the best format that has audio and video in a single file, since it can't merge them without FFMPEG.
For example, out of the available formats for the video I linked:
ID EXT RESOLUTION FPS CH │ FILESIZE TBR PROTO │ VCODEC VBR ACODEC ABR ASR MORE INFO
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────
sb2 mhtml 48x27 0 │ mhtml │ images storyboard
sb1 mhtml 80x45 0 │ mhtml │ images storyboard
sb0 mhtml 160x90 0 │ mhtml │ images storyboard
599 m4a audio only 2 │ 14.65MiB 31k https │ audio only mp4a.40.5 31k 22k ultralow, m4a_dash
600 weba audio only 2 │ 16.93MiB 36k https │ audio only opus 36k 48k ultralow, weba_dash
139 m4a audio only 2 │ 23.22MiB 49k https │ audio only mp4a.40.5 49k 22k low, m4a_dash
249 weba audio only 2 │ 24.80MiB 52k https │ audio only opus 52k 48k low, weba_dash
250 weba audio only 2 │ 32.58MiB 68k https │ audio only opus 68k 48k low, weba_dash
140 m4a audio only 2 │ 61.62MiB 129k https │ audio only mp4a.40.2 129k 44k medium, m4a_dash
251 weba audio only 2 │ 61.33MiB 129k https │ audio only opus 129k 48k medium, weba_dash
17 3gp 176x144 6 1 │ 36.14MiB 76k https │ mp4v.20.3 76k mp4a.40.2 0k 22k 144p
597 mp4 256x144 13 │ 8.21MiB 17k https │ avc1.4d400b 17k video only 144p, mp4_dash
598 webm 256x144 13 │ 8.85MiB 19k https │ vp9 19k video only 144p, webm_dash
394 mp4 256x144 25 │ 29.50MiB 62k https │ av01.0.00M.08 62k video only 144p, mp4_dash
160 mp4 256x144 25 │ 19.16MiB 40k https │ avc1.4d400c 40k video only 144p, mp4_dash
278 webm 256x144 25 │ 31.33MiB 66k https │ vp9 66k video only 144p, webm_dash
395 mp4 426x240 25 │ 37.51MiB 79k https │ av01.0.00M.08 79k video only 240p, mp4_dash
133 mp4 426x240 25 │ 40.47MiB 85k https │ avc1.4d4015 85k video only 240p, mp4_dash
242 webm 426x240 25 │ 40.88MiB 86k https │ vp9 86k video only 240p, webm_dash
396 mp4 640x360 25 │ 72.51MiB 152k https │ av01.0.01M.08 152k video only 360p, mp4_dash
134 mp4 640x360 25 │ 79.34MiB 167k https │ avc1.4d401e 167k video only 360p, mp4_dash
18 mp4 640x360 25 2 │ 215.72MiB 453k https │ avc1.42001E 453k mp4a.40.2 0k 44k 360p
243 webm 640x360 25 │ 91.86MiB 193k https │ vp9 193k video only 360p, webm_dash
397 mp4 854x480 25 │ 128.15MiB 269k https │ av01.0.04M.08 269k video only 480p, mp4_dash
135 mp4 854x480 25 │ 125.47MiB 264k https │ avc1.4d401e 264k video only 480p, mp4_dash
244 webm 854x480 25 │ 145.65MiB 306k https │ vp9 306k video only 480p, webm_dash
22 mp4 1280x720 25 2 │ ~762.64MiB 1565k https │ avc1.64001F 1565k mp4a.40.2 0k 44k 720p
398 mp4 1280x720 25 │ 259.04MiB 544k https │ av01.0.05M.08 544k video only 720p, mp4_dash
136 mp4 1280x720 25 │ 194.56MiB 409k https │ avc1.4d401f 409k video only 720p, mp4_dash
247 webm 1280x720 25 │ 269.18MiB 566k https │ vp9 566k video only 720p, webm_dash
399 mp4 1920x1080 25 │ 477.27MiB 1003k https │ av01.0.08M.08 1003k video only 1080p, mp4_dash
137 mp4 1920x1080 25 │ 652.13MiB 1370k https │ avc1.640028 1370k video only 1080p, mp4_dash
248 webm 1920x1080 25 │ 489.04MiB 1028k https │ vp9 1028k video only 1080p, webm_dash
400 mp4 2560x1440 25 │ 1.51GiB 3246k https │ av01.0.12M.08 3246k video only 1440p, mp4_dash
271 webm 2560x1440 25 │ 1.42GiB 3051k https │ vp9 3051k video only 1440p, webm_dash
401 mp4 3840x2160 25 │ 3.26GiB 7008k https │ av01.0.12M.08 7008k video only 2160p, mp4_dash
313 webm 3840x2160 25 │ 4.02GiB 8645k https │ vp9 8645k video only 2160p, webm_dash
In this case, yt-dlp will grab format 22, since it's the best one with audio and video in a single file, but it is only 720p. If FFMPEG were installed, it would instead grab format 313 for video and format 251 for audio and merge them into a single webm file.
I'm not sure whether it is possible to make the Python package aware of a system install of FFMPEG for this to work in yark. I also think this would require some changes in the web server and in the checks for an already-downloaded video, since the MP4 format is assumed.
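For the "already downloaded" check, something along these lines might work once more than one container is allowed. This is only a sketch: the `<id>.<ext>` file layout, the `find_downloaded_video` name, and the extension list are assumptions for illustration, not yark's actual storage logic.

```python
from pathlib import Path
from typing import Optional

# Assumed set of containers the archive might hold; adjust as needed
VIDEO_EXTENSIONS = (".mp4", ".webm", ".mkv")


def find_downloaded_video(videos_dir: Path, video_id: str) -> Optional[Path]:
    """Return the path of an existing download for `video_id`, if any.

    Assumes files are stored as `<video_id><ext>` directly inside `videos_dir`.
    """
    for ext in VIDEO_EXTENSIONS:
        candidate = videos_dir / f"{video_id}{ext}"
        if candidate.is_file():
            return candidate
    return None
```

The web server could use the returned path's suffix to pick the right MIME type instead of hard-coding MP4.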
Yeah, ffmpeg might be annoying to download. I'll patch this now and figure out implementing FFMPEG in 1.3, because videos being limited to 720p isn't great.
I'm fine with the archiver using any popular format, probably whatever the native html <video> tag supports as a general benchmark.
You could just opportunistically use ffmpeg if it's already installed on the PATH, otherwise keep doing it the current way.
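A minimal sketch of that opportunistic check, using `shutil.which` from the standard library. The `pick_format` name and the injectable `which` parameter are hypothetical (the parameter is only there to make the function easy to test), and the two selectors are just examples of yt-dlp format strings, not what yark ships with.

```python
import shutil
from typing import Callable, Optional


def pick_format(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Choose a yt-dlp format selector depending on whether ffmpeg is on the PATH."""
    if which("ffmpeg"):
        # With ffmpeg available, separate video and audio streams can be merged,
        # so ask for the best of each and fall back to a single file otherwise
        return "bestvideo+bestaudio/best"
    # Without ffmpeg, stick to formats that already contain audio and video
    return "best"
```

The result would then be passed as the `format` option when starting the download, so users without ffmpeg keep the current behaviour.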